1. Introduction

We consider here the problem of finding necessary and sufficient conditions for the two weight
boundedness of Calderon-Zygmund operators. We give such necessary and sufficient conditions
in very natural terms when the operator is the Hilbert transform and the weights satisfy a
certain natural condition. This condition might even turn out to be necessary for the two weight
boundedness of the Hilbert transform. If so, our result would give a necessary and sufficient
condition for the two weight boundedness of the Hilbert transform for two arbitrary weights
(measures). However, we can neither prove nor disprove the necessity of our condition (called
the pivotal condition in the text below). But we do know that it is satisfied for doubling
measures. Thus we reprove the results from [37]. We also indicate some other situations in which
our pivotal condition is automatically satisfied and is easily verified.

Our necessary and sufficient conditions for boundedness seem to be quite natural.
Actually, in the one weight (one measure) case they become the famous T1 conditions in
slight disguise. We discuss a bit later why our conditions are exactly the correct
generalization of the T1 conditions of David-Journe from the one measure case to the two
measure case. For now we just want to warn the reader that even in the one measure case
considered by David-Journe [H], [TJ], they really used that their measure is Lebesgue measure.
More general T1 theorems were proved by Christ [5]; these were again one measure T1 theorems,
but now the measure under consideration was allowed to be an arbitrary measure with the
doubling condition (a homogeneous measure, in the widespread terminology). It took considerable
effort to get rid of the last assumption. Now T1 theorems for one measure exist that do not use
homogeneity. This is the scope of non-homogeneous Harmonic Analysis, and we refer the
reader to [21] P! and to pE], [UJ.

The reader will see a bit later what we understand by a T1 theorem for two measures, but
we can already say that it will consist of Sawyer type test conditions. In other words, our
T1 generalization amounts to just testing the operator T (and its adjoint) on characteristic
functions of cubes (intervals), exactly as was done by Sawyer in the series of works
[H]-[45], which appeared at approximately the same time as David-Journe's T1 theory.

Of course, the difference between Sawyer's works and David-Journe's works is that he
considered the two weight situation, while they considered the one weight situation (actually
the Lebesgue measure situation). On the other hand, David-Journe considered singular operators,
while the operators considered by Sawyer were not singular: they were operators with positive
kernels. But strangely enough, to the best of our knowledge, it was not commonly recognized
that the T1 conditions are identical to Sawyer's test conditions!

Maybe nobody (as far as we can tell) made a point of saying that these are two equivalent
assumptions because they were applied in different situations. But let us confirm that the
test conditions of Sawyer and the T1 conditions of David-Journe are actually identical. Suppose
$T$ is an operator with a Calderon-Zygmund kernel $k(x,y)$ (we will assume a bit later that the
kernel is antisymmetric; this is only for the sake of brevity of reasoning). We have two
seemingly different claims:

• $T1 \in BMO$,

• $\forall Q:\ \|T\chi_Q\|^2_{L^2(Q)} \le c\,|Q|$.

The first claim is of course the assumption of David-Journe's T1 theorem, the second one is one
of several possible Sawyer type test conditions. Of course they are equivalent, and in a
trivial way. In fact, let us assume Sawyer's test condition. To prove that $T1 \in BMO$
we fix any cube $Q$, consider the decomposition $1 = \chi_{\mathbb{R}^n\setminus 2Q} + \chi_{2Q}$, and apply
$T$ to it. By the Calderon-Zygmund property of the kernel one immediately gets (see [14])
that the function $f_1 := T\chi_{\mathbb{R}^n\setminus 2Q}$ satisfies $\int_Q |f_1 - c_Q|^2\,dx \le c\,|Q|$ for a certain constant $c_Q$. But
Sawyer's test condition gives $\int_Q |T\chi_{2Q}|^2\,dx \le c\,|2Q| \le C\,|Q|$. Then immediately
$\int_Q |T1 - c_Q|^2\,dx \le C\,|Q|$, and this means $T1 \in BMO$.

On the other hand, if we first assume David-Journe's condition $T1 \in BMO$, then the
same decomposition gives $\int_Q |T\chi_{2Q} - c_Q|^2\,dx \le C\,|Q|$ for the constant
$c_Q = \frac{1}{|Q|}\int_Q T\chi_{2Q}\,dx$. This constant is easy to estimate. It is especially easy for
Calderon-Zygmund operators with antisymmetric kernel ($k(x,y) = -k(y,x)$), which is the
leading interesting case anyway. So let us consider only antisymmetric $T$. For them
$\int_Q T\chi_Q\,dx = 0$, so obviously $|c_Q| = \bigl|\frac{1}{|Q|}\int_{\mathbb{R}^n} \chi_Q\, T\chi_{2Q\setminus Q}\,dx\bigr|$. We now put the absolute value
inside the integral and use the roughest possible estimate of a Calderon-Zygmund kernel,
$|k(x,y)| \le \frac{C}{|x-y|^n}$, which gives $|c_Q| \le C$. Then $\int_Q |T\chi_{2Q} - c_Q|^2\,dx \le C\,|Q|$ and $|c_Q| \le C$
give us $\int_Q |T\chi_{2Q}|^2\,dx \le C\,|Q|$. Now, to conclude that $\int_Q |T\chi_Q|^2\,dx \le C\,|Q|$, we need the estimate
$\int_Q |T\chi_{2Q\setminus Q}|^2\,dx \le C\,|Q|$, which is again immediate from the same roughest estimate of the
Calderon-Zygmund kernel.

Our paper is devoted to the two weight case. We believe that Sawyer's test condition
(which we just showed to be equivalent to the T1 condition of David-Journe in the one weight case)
gives the correct generalization of T1 theorems to the two weight (two measure) situation.
So for us the two weight T1 theorem is just the result which says that a singular operator
is bounded from one $L^2$ to another $L^2$ if and only if it is uniformly bounded when tested on
characteristic functions, exactly as in Sawyer's test conditions.

For certain model Calderon-Zygmund operators we prove exactly this type of T1 theorem
in [3H]. These are certain dyadic Calderon-Zygmund operators, and we give in
[38], [SB] necessary and sufficient conditions on the weights, without assuming anything, for
these operators to be bounded between two different weighted $L^2$ spaces. In particular, in
[38], [36] we treated a two weight T1 theorem for the so-called Martingale Transforms, which
are sometimes considered a dyadic version of the Hilbert transform. It is known that the
Hilbert Transform and the Martingale Transform are very closely related, see, for example,

CP- ®-

Notice that the two weight problem for singular operators seemed to be extremely diffi- 
cult, adequate tools seemed to be not available. The theory of nonhomogeneous Calderon- 
Zygmund operators, as we will see below, at least gives a considerable hope to understand 
such two weight problems. 
The interest in two weight problems for singular operators naturally arises from an
attempt to understand when an operator in a Hilbert space has an unconditional spectral
decomposition. Due to Wermer [51], the following rigidity claim holds: this unconditional
spectral decomposition exists for $T$ if and only if $T = S^{-1}NS$, where $N$ is a normal operator
and $S$ is invertible (similarity). This question of similarity to a normal operator has recently
received a lot of attention for different classes of $T$. We mention here [I], [28], [23], [2T], [2H], [30].
If $T$ is a small perturbation of a unitary operator (even a rank one perturbation), then in
general the criterion of similarity to a normal operator is more or less totally open. Even
if $T$ is a contraction, the relation between the spectral data of $U$ and $N$ is very subtle
in general. Questions of this kind very quickly become related to two weight problems for
the Cauchy transform, as illustrated by [28]. For example, [28] is based on a remarkable
example of Fedor Nazarov, which shows that the Hunt-Muckenhoupt-Wheeden criterion for one
weight boundedness of the Hilbert transform is not applicable in the two weight situation. The
reader will find more details in [29], [30] and in Section [2J [3j

An unexpected application of the two weight Hilbert Transform was awaiting in a problem
from the spectral theory of almost periodic Jacobi matrices, see [53], [ID]. In these papers the
singular continuous spectrum of a wide and natural class of Jacobi matrices was related to
the properties of a certain Gekhtman-Faybusovich flow (see [IB]), which is not unlike the
well-known Toda flow of Jacobi matrices. In turn, uniform boundedness in this flow turns
out to be exactly equivalent to a certain two weight Hilbert transform problem.

Finally, let us explain the main difficulty of the theory of nonhomogeneous
Calderon-Zygmund operators. Roughly speaking, the difficulty appears as a result of a certain
degeneracy in the operator. We can evoke a vague analogy with subellipticity in PDE.
In our case the degeneracy appears not in the kernel of the operator (the kernel is a classical
Calderon-Zygmund kernel) but in the underlying measure. To illustrate what kind of difficulty
persistently appears, suppose that we need to estimate the quantity

\[ I := \int_Q\int_R k(x,y)\,f(x)\,g(y)\,d\mu(x)\,d\mu(y). \tag{1.1} \]

Three possibilities can logically occur: 1) to estimate $k$ in $L^\infty$ (maybe after using some
sort of cancellation), and to estimate $f$ in $L^1(\mu)$, $g$ in $L^1(\mu)$; 2) to estimate $k$ in $L^1L^\infty$ (this
is a mixed norm, $L^1$ in the first variable, $L^\infty$ in the second one), and to estimate $f$ in $L^\infty(\mu)$,
$g$ in $L^1(\mu)$; 3) to estimate $k$ in $L^1$, and to estimate $f$ in $L^\infty(\mu)$, $g$ in $L^\infty(\mu)$.

In the first case no difficulty appears. We need to bound $I$ in (1.1) by $\|f\|_{L^2}\|g\|_{L^2}$, and
this is not a problem, because by the Cauchy inequality $\|f\|_{L^1} \le \mu(Q)^{1/2}\|f\|_{L^2}$ and $\|g\|_{L^1} \le \mu(R)^{1/2}\|g\|_{L^2}$.

Suppose we want to repeat something like that in the second case. First of all, the $L^\infty$
norm cannot be estimated by the $L^2$ norm. But this is not the difficulty (strangely), because in
the expression $I$ the functions $f$, $g$ are usually very simple, basically constant on $Q$, $R$. In this case we
have the desired estimates: $\|f\|_{L^\infty} \le \mu(Q)^{-1/2}\|f\|_{L^2}$, $\|g\|_{L^1} \le \mu(R)^{1/2}\|g\|_{L^2}$. Subsequently,
we get the expression $\left(\frac{\mu(R)}{\mu(Q)}\right)^{1/2}$. This is not such a nice expression, because the measure of a (small)
set $Q$ stands in the denominator. For good measures (for example, for Lebesgue measure) we
have control of these "small denominators". But for an arbitrary measure the denominator
can be arbitrarily small, or even vanish. The only hope is that $R \subset Q$ in all such cases.
But this is usually not so. Usually the mutual position of $Q$, $R$ is quite arbitrary. In the
third case there are two small numbers in the denominator. This is even worse. So we are
bound for disaster if we reduce the estimate of the operator with kernel $k$ to estimates
of sums of terms of type $I$. But actually this is exactly the most natural way of estimating
Calderon-Zygmund operators. So to avoid this disaster we have to avoid bad mutual positions of $Q$, $R$.
This goal is attained by considering a random decomposition (with respect to a random dyadic
lattice) of our functions and an averaging procedure. This randomness compensates for the
degeneracies of the measure because it "smooths out" the degeneracies (although not
in the strict sense of this word). In another context the random dyadic lattice of course already
appeared in harmonic analysis, in [T7], for example. Decomposition of functions to estimate a
Calderon-Zygmund operator is not something new either, see [TT]. But the combination of
these two ideas is what allows us to overcome the degeneracies of measures. The machinery of this
is presented below, along with two applications (mentioned already) of this technique.



2. Two weight estimate for the Hilbert transform.

Preliminaries. 

We now start the development of two weight estimates for some Calderon-Zygmund
operators. The technique for degenerate (nonhomogeneous) cases of the Tb theorem (see [31]-[34])
seems to work very well also for this quite intriguing problem from the theory of
Calderon-Zygmund operators.

Let us recall a little bit of the history of the problem. For some time we will be mentioning
only the Hilbert transform, the common model of a Calderon-Zygmund operator. In 1960
Helson and Szego in [19] described the weights $w$ such that, say, for all smooth $f$ with compact
support on the real line $\mathbb{R}$,

\[ \int_{\mathbb{R}} |Hf|^2\,w\,dx \le C \int_{\mathbb{R}} |f|^2\,w\,dx, \tag{2.1} \]

where the Hilbert transform $H$ is defined as follows:

\[ Hf(x) := \frac{1}{\pi}\,\mathrm{p.v.}\!\int_{\mathbb{R}} \frac{f(t)}{x-t}\,dt := \lim_{\varepsilon\to 0+} \frac{1}{\pi}\int_{t:\,|t-x|>\varepsilon} \frac{f(t)}{x-t}\,dt. \tag{2.2} \]

Here is the description of Helson and Szego: the weight satisfies (2.1) if and only if

\[ \log w = u + Hv, \qquad u, v \in L^\infty, \quad \|v\|_\infty < \frac{\pi}{2}. \tag{2.3} \]

In 1971 a new description of such weights appeared. This description was due to Hunt,
Muckenhoupt and Wheeden [20], and it was in totally different terms:

\[ Q_w := \sup_{I\subset\mathbb{R}} \langle w\rangle_I\,\langle w^{-1}\rangle_I < \infty. \tag{2.4} \]

Here $I$ runs over all finite intervals of the real line, and $\langle w\rangle_I := \frac{1}{|I|}\int_I w\,dx$ denotes the average over $I$.
It took some time to find the correct analog of this result in the vector-valued situation
(matrix weights); this was done in [18] and [51] only in the 90's. Note that so far there is no
direct proof that (2.4) implies (2.3).
Of course the problem with two weights attracted attention. The problem is to
describe the pairs of nonzero weights $u$, $v$ such that

\[ \int_{\mathbb{R}} |HF|^2\,v\,dx \le C \int_{\mathbb{R}} |F|^2\,u\,dx. \tag{2.5} \]

There is a vast literature about two weight problems. Here we mention only the works
of P. Koosis [21], [25].

One weight inequality became very important because of its relations with the theory 
of Toeplitz operators and with the spectral theory of stationary stochastic processes, see 



The two weight inequality first attracted attention because of its obvious relation to
the one weight counterpart. But recently it became clear that it can be very essential in the
perturbation theory of unitary and self-adjoint operators and in the spectral theory of Jacobi
matrices. In particular, the question of when a rank one perturbation of a unitary operator
is similar to a unitary operator is essentially the question about the two weight estimate
of the Hilbert transform, see [28], for example. Subtle questions about the subspaces of the
Hardy class $H^2$ invariant under the inverse shift operator are also essentially questions
about the two weight Hilbert transform, see [30]. And finally, see [53], [ID] for how the two
weight Hilbert transform appears naturally in certain unsolved questions concerning
orthogonal polynomials and the spectral theory of Jacobi matrices.

Let us formulate the two weight Hilbert transform problem in a form which is more
convenient for us than (2.5). Let $\mu$, $\nu$ be two positive measures on $\mathbb{R}$. We define the Hilbert
transform from $L^2(\mu)$ to $L^2(\nu)$ as any bounded linear operator $H_\mu$ from $L^2(\mu)$ to $L^2(\nu)$ such
that

\[ H_\mu f(x) := \frac{1}{\pi}\int_{\mathbb{R}} \frac{f(t)}{x-t}\,d\mu(t), \qquad \forall x \in \mathbb{R}\setminus \mathrm{supp}(f). \tag{2.6} \]

Such an operator is not uniquely defined. But we will prove the main result for all such
operators. Notice that the adjoint $H_\mu^*$ is just $-H_\nu$; it is also just a Hilbert transform in our
sense (up to a minus sign).

Let us change the variables in (2.5): $d\mu := \frac{1}{u}\,dx$, $F := \frac{f}{u}$, $d\nu := v\,dx$. Then (2.5) transforms
into

\[ \int_{\mathbb{R}} |H_\mu f|^2\,d\nu \le C \int_{\mathbb{R}} |f|^2\,d\mu. \tag{2.7} \]

A very subtle point is that we are not interested in when (2.7) holds with the same finite
$C$ for all $f$ in $L^2(\mu)$. We already assumed by definition that this is the case. What we are
interested in are some simple characteristics computable by means of $\mu$ and $\nu$, and such that
$C$ can be estimated by these characteristics. An example of such a characteristic is

\[ Q_{\mu,\nu} := \sup_{I\subset\mathbb{R}} \langle\mu\rangle_I\,\langle\nu\rangle_I := \sup_{I\subset\mathbb{R}} \frac{\mu(I)}{|I|}\,\frac{\nu(I)}{|I|}. \tag{2.8} \]

This is a total analog of $Q_w$ from [20]. In fact, in the one weight case $u = v = w$ of (2.5) we
have $d\mu = \frac{1}{w}\,dx$, $d\nu = w\,dx$, and so $Q_{\mu,\nu}$ becomes $Q_w$. We will see soon that

\[ Q_{\mu,\nu}^{1/2} \le A\,\|H_\mu\|_{L^2(\mu)\to L^2(\nu)}. \tag{2.9} \]
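As a purely numerical illustration (it plays no role in the argument), the following Python sketch evaluates the quantity in (2.8) over a finite family of intervals. It assumes, hypothetically, that $\mu$ and $\nu$ have densities that are constant on the cells of a uniform grid, and it takes the supremum only over grid-aligned intervals, so it returns a lower bound for $Q_{\mu,\nu}$ rather than the characteristic itself. The function name `two_weight_a2` is ours and does not come from the text.

```python
from itertools import accumulate


def two_weight_a2(mu_density, nu_density, cell_len):
    """Lower bound for sup_I mu(I) * nu(I) / |I|^2 over grid-aligned intervals.

    mu_density, nu_density: densities of mu and nu with respect to dx,
    assumed constant on consecutive cells of length cell_len (an assumption
    made only for this illustration).
    """
    if len(mu_density) != len(nu_density):
        raise ValueError("density lists must have the same length")
    if cell_len <= 0:
        raise ValueError("cell_len must be positive")

    # Cumulative masses, so that mu([cell i, ..., cell j-1]) = mu_cum[j] - mu_cum[i].
    mu_cum = [0.0] + list(accumulate(d * cell_len for d in mu_density))
    nu_cum = [0.0] + list(accumulate(d * cell_len for d in nu_density))

    n = len(mu_density)
    best = 0.0
    for i in range(n):
        for j in range(i + 1, n + 1):
            length = (j - i) * cell_len
            mu_i = mu_cum[j] - mu_cum[i]
            nu_i = nu_cum[j] - nu_cum[i]
            best = max(best, mu_i * nu_i / length ** 2)
    return best
```

In the one weight case $d\mu = dx/w$, $d\nu = w\,dx$ with $w$ equal to $1$ and $4$ on two adjacent cells of length $1/2$, the call `two_weight_a2([1.0, 0.25], [1.0, 4.0], 0.5)` gives $1.5625$, attained on the union of the two cells; each single cell gives exactly $1$.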
Of course, we are interested in a sort of opposite estimate. After all, the Hunt-Muckenhoupt-Wheeden
theorem from [20] says that the finiteness of $Q_w$ is equivalent to the boundedness of
the corresponding Hilbert transform. Moreover, recently S. Petermichl [?2] proved that

\[ \|H_\mu\|_{L^2(\mu)\to L^2(\nu)} \le A\,Q_{\mu,\nu} = A\,Q_w \]

in the one weight case, that is, when $d\mu = \frac{1}{w}\,dx$, $d\nu = w\,dx$. See also [13], where this is proved
for the Ahlfors-Beurling transform instead of the Hilbert transform.

However, in the two weight case nothing like the full analog of the Hunt-Muckenhoupt-Wheeden
result (not even to mention [12] or [13]) is possible. Strangely enough, this has been
understood only recently, due to the work of F. Nazarov [29]. See also [28], [3D].

At any rate, $Q_{\mu,\nu}$ will be an important characteristic of "Hunt-Muckenhoupt-Wheeden"
type, which will play an important part in estimating $\|H_\mu\|_{L^2(\mu)\to L^2(\nu)}$. The only thing we
have said is that it alone is not sufficient. One has to look for other $\mu,\nu$-quantities.

It is important to mention that, unlike the "Hunt-Muckenhoupt-Wheeden" type
characteristics of two measures (two weights), the "Helson-Szego" type characteristics were found
long ago. This was done in the papers of Cotlar and Sadosky [Z]-[8]. Paper [6] gives
another equivalent form of the Helson-Szego condition. Papers [H]-[ID] also treat the Helson-Szego
type theorem in $L^p$ for the case $p \ne 2$.

From what we described above it becomes clear that we are after "Hunt-Muckenhoupt-Wheeden"
type characteristics of two measures (two weights) which, together with the
characteristic $Q_{\mu,\nu}$, will allow us to estimate $\|H_\mu\|_{L^2(\mu)\to L^2(\nu)}$.

The difficulty is twofold. First of all, two weight problems have a huge degree of freedom
compared with the rather rigid one weight problems. This is why the single quantity $Q$ is not sufficient.
Secondly, we are dealing with a singular operator. Singular kernels are much more difficult to
deal with than positive kernels. Maybe for operators with positive kernels the two weight
problems are more approachable? It was found in the mid 80's that this is the case.

E. Sawyer was the first who fully characterized the boundedness of several important 
operators with positive kernels between two weighted spaces. This concerned in particular 
maximal operator and Carleson imbedding theorem. The reader is referred to [2], [45] . 
[22] and also to [38], where Sawyer's results got Bellman function explanation. Sawyer's 
conditions were simple and beautiful, they were in a sense of "Hunt-Muckenhoupt-Wheeden" 
type. But actually their meaning was very transparent: 

a fairly general operator with positive kernel is bounded between two weighted $L^2$
spaces if and only if it is uniformly bounded on a system of simple test functions
and the same holds for its adjoint.

It is usually enough to take the characteristic functions of the intervals (cubes) as the family 
of test functions. 

This was a remarkable discovery. Actually, almost at the same time a series of works
of G. David and J.-L. Journe appeared, devoted to the so-called T1 theorems. Here the
main object was singular operators (the kernel changes sign), more precisely,
Calderon-Zygmund operators. The answer (these T1 theorems) was in the same spirit: check $T$ and
$T^*$ on characteristic functions of intervals. But unlike the case considered by Sawyer, these
problems of David and Journe were one weight problems, in the following sense: given an
operator with Calderon-Zygmund kernel $k$, bounded in $L^2(\mu)$, one looks for characteristics
which allow one to estimate the norm of this operator. The phrase "given the operator $T$ with
Calderon-Zygmund kernel $k$" means that we are given a Calderon-Zygmund kernel $k$ and a
positive measure $\mu$ (say in $\mathbb{R}^n$) so that

\[ T_\mu f(x) := \int k(x,t)\,f(t)\,d\mu(t), \qquad \forall x \in \mathbb{R}^n\setminus\mathrm{supp}(f). \tag{2.10} \]

There are many such operators of course. But David-Journe were looking for characteristics
which give a bound on the norms of all such operators, meaning the norms from $L^2(\mu)$ to
the same $L^2(\mu)$. This is why we call such problems one weight problems: they concern the
estimate of $\|T_\mu : L^2(\mu) \to L^2(\mu)\|$. Notice that here "one weight problem" means something
quite different than in the Hunt-Muckenhoupt-Wheeden theorem. There one deals with
$\|H_\mu : L^2(\mu) \to L^2(\nu)\|$, but for a very special case: $\nu = w\,dx$, $\mu = \frac{1}{w}\,dx$.

The last important remark is that the theory of David-Journe (usually united under the
name "T1 theorems") originally concerned only one measure $\mu$, namely, Lebesgue measure
in $\mathbb{R}^n$: $d\mu = dx$. It was noticed that for doubling measures one can construct a series of T1
theorems. This was done in a paper by M. Christ [5]. The doubling property seemed to
be a cornerstone of the David-Journe-Christ theory of Calderon-Zygmund operators. However,
a strong need to get rid of this cornerstone appeared from the attempt to solve Vitushkin's
problems. See the reviews (SO], P2], [27], [52].

Summarizing all this: one weight problems (in both senses indicated above) are difficult,
but basically solved both for Calderon-Zygmund operators and for operators with positive
kernels.

Two weight problems are considerably more difficult, but basically solved for a wide
class of operators with positive kernels.

We consider the worst of both worlds. Our operators will be singular (we consider just
some model, for example, the Hilbert transform or the Martingale transform), and instead
of a one weight problem we consider a two weight problem. This is why we need all the tricks
from [33] [35] dealing with nonhomogeneous T1 and Tb theorems.

Here is our main result concerning the two weight Hilbert transform. It uses almost fully
the box of tools we applied in previous papers [31] [35] to construct a nonhomogeneous
version of Calderon-Zygmund theory (criteria for the boundedness of a CZ operator $T_\mu :
L^2(\mu) \to L^2(\mu)$ for nondoubling $\mu$). The huge drawback of what has been done in [37]
is that we were obliged to impose doubling conditions on $\mu$, $\nu$ if we wanted to prove a
simple Hunt-Muckenhoupt-Wheeden (actually Sawyer) type result on the boundedness of $H_\mu :
L^2(\mu) \to L^2(\nu)$. This unwelcome but recurring doubling assumption is probably not needed:
the result should be true in general. But the huge difficulty of two weight estimates for
singular operators forced us to impose this assumption. This is especially strange because
we use the "nonhomogeneous" technique, which is supposed to smooth out all degeneracies of
the measures. And it does. But so far only for one weight problems. (The recent paper [2S]
of 2010 shows this for the two weight situation completely.)

In the present paper we do not impose any doubling condition on the measures. But instead
of getting a criterion (a necessary and sufficient condition) for the boundedness of the two weight
Hilbert transform, we get a criterion for the two weight boundedness of a family of
operators; one operator in this family is indeed our two weight Hilbert transform. The family
consists of three operators. Let us write down the other two. They are standard maximal
operators:

\[ M_\mu f(x) := \sup_{I:\,x\in I} \frac{1}{|I|}\int_I |f|\,d\mu, \qquad M_\nu g(x) := \sup_{I:\,x\in I} \frac{1}{|I|}\int_I |g|\,d\nu. \]

By the works of E. Sawyer [H], [13] it is known when $M_\mu$ is a bounded operator from $L^2(\mu)$
to $L^2(\nu)$. This happens if and only if the uniform bound on test functions holds:

\[ \|M_\mu\chi_I\|^2_{\nu} \le C_M\,\mu(I), \qquad \forall \text{ interval } I. \tag{2.11} \]

The symmetric condition (with $\mu$ and $\nu$ exchanged) is necessary and sufficient for the
boundedness of $M_\nu$:

\[ \|M_\nu\chi_I\|^2_{\mu} \le C_M\,\nu(I), \qquad \forall \text{ interval } I. \tag{2.12} \]

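Of course, the following two conditions are both necessary for the operator $H_\mu$ to be bounded
from $L^2(\mu)$ to $L^2(\nu)$:

\[ \|H_\mu\chi_I\|^2_{L^2(\nu)} \le C_\chi\,\mu(I), \qquad \forall I \subset \mathbb{R}, \tag{2.13} \]

\[ \|H_\nu\chi_I\|^2_{L^2(\mu)} \le C_\chi\,\nu(I), \qquad \forall I \subset \mathbb{R}. \tag{2.14} \]

They are the analogs of Sawyer's conditions above, but applied to a singular operator.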

One more condition will be important to us. (It is necessary too!) Let us recall that we
introduced $Q_{\mu,\nu}$ in (2.8). Its finiteness is necessary for the boundedness of the corresponding
two weight Hilbert transform. We will see this soon. But actually there is a slightly larger
quantity, more convenient for us. Its finiteness is necessary for the boundedness of the
corresponding two weight Hilbert transform as well. Let us introduce it. Recall that the Poisson
extension of a measure $\mu$ supported on $\mathbb{R}$ is given by the formula

\[ P_\mu(z) := \frac{1}{\pi}\int_{\mathbb{R}} \frac{\Im z}{|z-t|^2}\,d\mu(t), \qquad z \in \mathbb{C}_+. \]

Put

\[ PQ_{\mu,\nu} := \sup_{z\in\mathbb{C}_+} P_\mu(z)\,P_\nu(z). \tag{2.15} \]

It is easy to see that there exists an absolute constant $A$ such that for any pair of positive
measures

\[ Q_{\mu,\nu} \le A\,PQ_{\mu,\nu}. \tag{2.16} \]
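One way to see (2.16), given here only as a sketch and assuming the normalization $P_\mu(x+iy) = \frac{1}{\pi}\int \frac{y}{(x-t)^2+y^2}\,d\mu(t)$: if $I$ is the interval centered at $x$ with $|I| = 2y$, then $(x-t)^2+y^2 \le 2y^2$ for $t \in I$, hence $P_\mu(x+iy) \ge \frac{\mu(I)}{\pi |I|}$, and the same holds for $\nu$. Under this normalization $A = \pi^2$ works. The Python sketch below evaluates the Poisson extension for a measure with piecewise constant density (a hypothetical setup chosen so that the integral has a closed form via the arctangent); the function name `poisson_extension` is ours.

```python
import math


def poisson_extension(density, edges, x, y):
    """Poisson extension P_mu(x + i*y) of d(mu) = density * dx.

    The density is assumed constant on each cell [edges[k], edges[k+1]]
    (an assumption made for this illustration only), so each cell contributes
    density[k] * (atan((b - x)/y) - atan((a - x)/y)) / pi.
    """
    if y <= 0:
        raise ValueError("the point must lie in the upper half-plane (y > 0)")
    if len(edges) != len(density) + 1:
        raise ValueError("edges must have exactly one more entry than density")

    total = 0.0
    for k, d in enumerate(density):
        a, b = edges[k], edges[k + 1]
        total += d * (math.atan((b - x) / y) - math.atan((a - x) / y))
    return total / math.pi
```

For Lebesgue measure restricted to $[-1,1]$, `poisson_extension([1.0], [-1.0, 1.0], 0.0, 1.0)` equals $1/2$, and with $\mu = \nu$ equal to this measure and $I = [-1,1]$ one has $\mu(I)\nu(I)/|I|^2 = 1 \le \pi^2 \cdot \frac14$, in agreement with the bound sketched above.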

Theorem 2.1. Let $\mu$, $\nu$ be arbitrary positive measures. Let $H_\mu$, $H_\nu$ be bounded on
characteristic functions, namely

\[ \|H_\mu\chi_I\|^2_{L^2(\nu)} \le C_\chi\,\mu(I), \qquad \forall I \subset \mathbb{R}, \tag{2.17} \]
\[ \|H_\nu\chi_I\|^2_{L^2(\mu)} \le C_\chi\,\nu(I), \qquad \forall I \subset \mathbb{R}. \tag{2.18} \]

Let also

\[ PQ_{\mu,\nu} = \sup_{z\in\mathbb{C}_+} P_\mu(z)\,P_\nu(z) \le C_P. \tag{2.19} \]

And finally let $M_\mu$, $M_\nu$ be also bounded on characteristic functions as written in (2.11),
(2.12). Then the family $H_\mu$, $M_\mu$, $M_\nu$ consists of operators bounded by a constant $C < \infty$,
which depends only on $C_\chi$, $C_P$ and the constants in (2.11), (2.12).

We can call it "Sawyer's theorem for the family consisting of the Hilbert transform and
Maximal operators". Or it can be viewed as a two weight version of David-Journe's T1 theorem
for the family consisting of the Hilbert transform and Maximal operators.

In fact, the theorem says that the family of three operators $H_\mu$, $M_\mu$, $M_\nu$ (or, if you wish,
of four operators $H_\mu$, $H_\nu$, $M_\mu$, $M_\nu$) is bounded if and only if Sawyer's conditions of testing
on characteristic functions are satisfied uniformly for the operators in the family. In this
respect it resembles the main result of [3E], where the family of operators was infinite (all
Martingale Transforms). This is exactly what David-Journe's T1 theorem says for Lebesgue
measure, Christ's T1 theorem says for an arbitrary doubling measure (but one measure,
not two), and the nonhomogeneous T1 theorem from [31] says for an arbitrary measure
(but again one measure, not two).

In view of Sawyer's theorem (see |Sj, [IS]), we can see that this result is equivalent to
the following one.

Theorem 2.2. Let $\mu$, $\nu$ be arbitrary positive measures. Assume that $M_\mu$ is bounded from
$L^2(\mu)$ to $L^2(\nu)$ and that $M_\nu$ is bounded from $L^2(\nu)$ to $L^2(\mu)$. Then $H_\mu$ is bounded from
$L^2(\mu)$ to $L^2(\nu)$ if and only if it is bounded on characteristic functions, namely

\[ \|H_\mu\chi_I\|^2_{L^2(\nu)} \le C_\chi\,\mu(I), \qquad \forall I \subset \mathbb{R}, \tag{2.20} \]

\[ \|H_\nu\chi_I\|^2_{L^2(\mu)} \le C_\chi\,\nu(I), \qquad \forall I \subset \mathbb{R}, \tag{2.21} \]

and also

\[ PQ_{\mu,\nu} = \sup_{z\in\mathbb{C}_+} P_\mu(z)\,P_\nu(z) \le C_P \tag{2.22} \]

is satisfied.

We believe that the assumption of the boundedness of the Maximal operators is superfluous
in Theorem 2.2.



3. Necessity in the Main Theorem 

Assumptions (2.20), (2.21) are obviously necessary. As to (2.22), it is necessary as well. In
fact, let us consider (just for the sake of convenience) our measures $\mu$, $\nu$ on the unit circle $\mathbb{T}$
(instead of on the line).
As in (2.6), we define the two weight Hilbert transform on the circle as follows. Let $\mu$, $\nu$
be two positive and finite measures on $\mathbb{T}$. We define the Hilbert transform $H_\mu$ from $L^2(\mu)$
to $L^2(\nu)$ as any bounded linear operator from $L^2(\mu)$ to $L^2(\nu)$ such that

\[ H_\mu f(z) := \frac{1}{2\pi}\int_{\mathbb{T}} \frac{f(\zeta)}{1-\bar\zeta z}\,d\mu(\zeta), \qquad \forall z \in \mathbb{T}\setminus\mathrm{supp}(f). \tag{3.1} \]

We recall that the Poisson integral of the measure $\mu$ on $\mathbb{T}$ is given by

\[ P_\mu(a) := \frac{1}{2\pi}\int_{\mathbb{T}} \frac{1-|a|^2}{|1-\bar a z|^2}\,d\mu(z), \qquad a \in \mathbb{D}. \tag{3.2} \]

In what follows we always consider only measures without atoms. Here is the
explanation. We want to get the necessity of (2.22). Suppose $\mu$, $\nu$ are both delta measures at the
same point. We adopted a definition of $H_\mu$ which allows for its non-uniqueness.
Two such operators $H_\mu$ may differ by a bounded operator from $L^2(\mu)$ to $L^2(\nu)$ that preserves the support
of a function, that is, by an operator of multiplication. In particular, in the case when
$\mu = \nu = \delta_1$, we can see that the identity operator is also an $H_\mu$. But $PQ_{\mu,\nu} = \infty$ obviously.

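For a point $a \in \mathbb{D}$ put $b_a(z) := \frac{z-a}{1-\bar a z}$. This is a Blaschke factor; it is a unimodular
function on the circle. So the operator $M_{b_a}$ of multiplication by $b_a$ is an isometry in any
$L^2(\sigma)$, $\mathrm{supp}\,\sigma \subset \mathbb{T}$. Given a bounded operator $H_\mu : L^2(\mu) \to L^2(\nu)$, consider a new operator
given by

\[ T_{\mu,a} := H_\mu - M_{b_a} H_\mu M_{\bar b_a}. \]

Then (3.1) implies that

\[ T_{\mu,a} f(z) = \frac{1}{2\pi}\int_{\mathbb{T}} \frac{1 - b_a(z)\,\overline{b_a(\zeta)}}{1-\bar\zeta z}\,f(\zeta)\,d\mu(\zeta), \qquad \forall z \in \mathbb{T}\setminus\mathrm{supp}(f). \tag{3.3} \]

An easy computation shows

\[ \forall \zeta, z \in \mathbb{T},\ \forall a \in \mathbb{D}: \qquad \frac{1 - b_a(z)\,\overline{b_a(\zeta)}}{1-\bar\zeta z} = \frac{1-|a|^2}{(1-a\bar\zeta)(1-\bar a z)}. \tag{3.4} \]

In particular, the kernel in (3.3) is bounded.

Let us present the idea of the rest of the proof. The kernel in (3.4) has rank one: it equals
$k_a(z)\,\overline{k_a(\zeta)}$ with $k_a(z) := \frac{(1-|a|^2)^{1/2}}{1-\bar a z}$. The norm of such an operator (as an
operator from $L^2(\mu)$ to $L^2(\nu)$) should of course be just $\frac{1}{2\pi}\|k_a\|_\mu\|k_a\|_\nu$, which is obviously (see
(3.2)) $(P_\mu(a)P_\nu(a))^{1/2}$. On the other hand,

\[ \|T_{\mu,a}\| = \|H_\mu - M_{b_a} H_\mu M_{\bar b_a}\| \le 2\,\|H_\mu\|, \tag{3.5} \]

as multiplications by $b_a$, $\bar b_a$ are isometries in $L^2(\mu)$, $L^2(\nu)$.
Combining this with (3.5) one gets

\[ (P_\mu(a)P_\nu(a))^{1/2} \le 2\,\|H_\mu\|. \tag{3.6} \]

So we would get (2.22).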
The problem with the "proof" above is that the operator $T_a$ with the kernel

\[ \frac{1}{2\pi}\,\frac{1-|a|^2}{(1-a\bar\zeta)(1-\bar a z)} \]

and the operator $T_{\mu,a}$ may be different. In fact, (3.3) says only that

\[ (T_{\mu,a} f, g)_\nu = (T_a f, g)_\nu, \qquad \forall f \in L^2(\mu),\ g \in L^2(\nu),\ \mathrm{supp}(f)\cap\mathrm{supp}(g) = \emptyset. \]

From this alone we do not get (3.6), but only its weaker version:

\[ (P_{\mu|E}(a)\,P_{\nu|F}(a))^{1/2} \le 2\,\|H_\mu\|, \qquad \forall E, F \subset \mathbb{T},\ E\cap F = \emptyset. \tag{3.7} \]

We are left to explain why (3.7) implies (3.6). They are both Mobius invariant, and so
let $a = 0$. As $\mu$ has no atoms, we can choose $E_1$ to be a half-circle such that $\mu(E_1) = \frac12\mu(\mathbb{T})$.
Let $E_2 = \mathbb{T}\setminus E_1$. Call $F$ the $E_i$ that has the larger $\nu$ measure. The other one is called $E$.
For example, if $\nu(E_1) \ge \nu(E_2)$, we have $F = E_1$, $E = E_2$. Then of course $P_\mu(0)P_\nu(0) \le
4\,P_{\mu|E}(0)\,P_{\nu|F}(0)$. And (3.6) with the constant 2 replaced by 4 follows from (3.7).



4. TWO WEIGHT HILBERT TRANSFORM. THE BEGINNING OF
THE PROOF OF THE MAIN THEOREM 

In what follows we use Nazarov-Treil-Volberg preprint [37]. F. Nazarov also noticed that 
what follows can be used for a wide class of Calderon-Zygmund operators and not only 
for the Hilbert Transform. But here we consider one and the only operator-the Hilbert 
transform. The full criterion for the two weight boundedness of "short range" Calderon- 
Zygmund operators (for example Martingale Transforms, Dyadic Shifts, and such...) can be 
found in [36]. In that paper no assumption on measures or any other extra assumption is 
used. 

Let $f \in L^2(\mu)$, $g \in L^2(\nu)$ be two test functions. We can assume without loss of
generality that they have compact support. Let us assume that their supports are in
$[\frac14, \frac34]$. Let $\mathcal{D}^\mu$, $\mathcal{D}^\nu$ be two dyadic lattices of $\mathbb{R}$. We can think that they are both shifts of the
same standard dyadic lattice $\mathcal{D}$, such that $[0,1] \in \mathcal{D}$, and that $\mathcal{D}^\mu = \mathcal{D} + \omega_1$, $\mathcal{D}^\nu = \mathcal{D} + \omega_2$,
where $\omega_1, \omega_2 \in [-\frac14, \frac14]$. We have a natural probability space of pairs of such dyadic lattices:

\[ \Omega := \{(\omega_1,\omega_2) \in [-\tfrac14, \tfrac14]^2\}, \]

provided with the probability $P$ which is equal to normalized Lebesgue measure on $[-\frac14, \frac14]^2$. We
call these two independent dyadic lattices $\mathcal{D}^\mu$, $\mathcal{D}^\nu$ because they will be used to decompose
$f \in L^2(\mu)$, $g \in L^2(\nu)$ correspondingly. This will be exactly the same type of decomposition
as in the "nonhomogeneous T1" theorems we met in [31]- [35]. We use the notion of weighted
Haar functions $h_I^\mu$, $h_J^\nu$, and the notion of operators $\Delta_I^\mu$, $\Delta_J^\nu$.
Let us recall that $h_I^\mu$ denotes the Haar function (supported by the interval $I \in \mathcal{D}^\mu$) with
respect to the measure $\mu$. In other words, it has two values (one on the left half $I_-$ of $I$, and
one on the right half $I_+$ of $I$) such that

\[ \int_I h_I^\mu\,d\mu = 0, \qquad \int_I (h_I^\mu)^2\,d\mu = 1. \]

The formula is

\[ h_I^\mu = \frac{1}{\mu(I)^{1/2}}\left[\left(\frac{\mu(I_-)}{\mu(I_+)}\right)^{1/2}\chi_{I_+} - \left(\frac{\mu(I_+)}{\mu(I_-)}\right)^{1/2}\chi_{I_-}\right]. \]
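The two defining properties of the weighted Haar function (zero $\mu$-mean and unit $L^2(\mu)$ norm) determine its two values up to a sign. The following Python sketch computes them from $\mu(I_-)$ and $\mu(I_+)$ alone; it assumes both halves have positive mass and fixes the sign so that the function is negative on the left half, which is a convention chosen for this illustration. The name `weighted_haar_values` is ours.

```python
import math


def weighted_haar_values(mu_left, mu_right):
    """Values of a mu-weighted Haar function on the two halves of an interval.

    mu_left = mu(I_-) and mu_right = mu(I_+) are assumed strictly positive.
    Returns (value on I_-, value on I_+), normalized so that the integral
    against mu vanishes and the integral of the square against mu equals 1.
    The sign (negative on I_-) is a convention adopted for this sketch.
    """
    if mu_left <= 0 or mu_right <= 0:
        raise ValueError("both halves must carry positive mass")
    mu_total = mu_left + mu_right
    value_left = -math.sqrt(mu_right / (mu_left * mu_total))
    value_right = math.sqrt(mu_left / (mu_right * mu_total))
    return value_left, value_right
```

For instance, with $\mu(I_-) = 1$ and $\mu(I_+) = 3$ the values are $-\sqrt{3/4}$ and $\sqrt{1/12}$, and one checks directly that $-\sqrt{3/4}\cdot 1 + \sqrt{1/12}\cdot 3 = 0$ and $\frac34\cdot 1 + \frac{1}{12}\cdot 3 = 1$.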



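The same holds for $h_J^\nu$ with $\mathcal{D}^\nu$ replacing $\mathcal{D}^\mu$. We introduce the familiar operators $\Delta_I^\mu$, $\Delta_J^\nu$.
Let $f \in L^2(\mu)$, $g \in L^2(\nu)$ be two test functions as above. Then

\[ \Delta_I^\mu(f) := (f, h_I^\mu)_\mu\, h_I^\mu, \qquad I \in \mathcal{D}^\mu,\ |I| \le 1, \]

\[ \Delta_J^\nu(g) := (g, h_J^\nu)_\nu\, h_J^\nu, \qquad J \in \mathcal{D}^\nu,\ |J| \le 1. \]

Also, let $I_0^\mu$ denote the interval of $\mathcal{D}^\mu$ of length 1 containing $\mathrm{supp}(f)$, and similarly $I_0^\nu$,
changing $f$ to $g$ and $\mu$ to $\nu$. Put

\[ \Lambda^\mu(f) := \Bigl(\frac{1}{\mu(I_0^\mu)}\int_{I_0^\mu} f\,d\mu\Bigr)\chi_{I_0^\mu}, \qquad \Lambda^\nu(g) := \Bigl(\frac{1}{\nu(I_0^\nu)}\int_{I_0^\nu} g\,d\nu\Bigr)\chi_{I_0^\nu}. \]

It is easy to see that the functions $\Lambda^\mu(f)$, $\Delta_I^\mu(f)$, $I \in \mathcal{D}^\mu$, are all pairwise orthogonal with
respect to the scalar product $(\cdot,\cdot)_\mu$ of $L^2(\mu)$. The same is true for $\Lambda^\nu(g)$, $\Delta_J^\nu(g)$, $J \in \mathcal{D}^\nu$,
with respect to the scalar product $(\cdot,\cdot)_\nu$ of $L^2(\nu)$. Actually, it is easy to see that the linear span of the family
$\chi_{I_0^\mu}$, $h_I^\mu$, $I \subset I_0^\mu$, is dense in the set of functions from $L^2(\mu)$ supported by $[\frac14, \frac34]$. The same is
true if we replace $\mu$ by $\nu$. Thus,

\[ f = \Lambda^\mu(f) + \sum_{I\in\mathcal{D}^\mu,\ I\subset I_0^\mu} \Delta_I^\mu(f), \qquad \|f\|_\mu^2 = \|\Lambda^\mu(f)\|_\mu^2 + \sum_{I\in\mathcal{D}^\mu,\ I\subset I_0^\mu} \|\Delta_I^\mu(f)\|_\mu^2. \tag{4.1} \]

Similarly,

\[ g = \Lambda^\nu(g) + \sum_{J\in\mathcal{D}^\nu,\ J\subset I_0^\nu} \Delta_J^\nu(g), \qquad \|g\|_\nu^2 = \|\Lambda^\nu(g)\|_\nu^2 + \sum_{J\in\mathcal{D}^\nu,\ J\subset I_0^\nu} \|\Delta_J^\nu(g)\|_\nu^2. \tag{4.2} \]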

These decompositions and the assumptions (2.20), (2.21) imply in a very easy fashion that we can consider only the case

$$\Lambda^\mu(f) = 0, \qquad \Lambda^\nu(g) = 0. \tag{4.3}$$

In fact, $(H_\mu f, g)_\nu = (H_\mu(f - \Lambda^\mu(f)), g)_\nu + \big(\frac{1}{\mu(I_0^\mu)}\int f \, d\mu\big)(H_\mu \chi_{I_0^\mu}, g)_\nu$, and the second term is bounded by $C(C_\chi) \|f\|_\mu \|g\|_\nu$ trivially by (2.20). Using (2.21) one can get rid of $\Lambda^\nu(g)$ as well.

So we always work under the assumption (4.3). Now, for simplicity, we think that $f, g$ are real valued. Then

$$(H_\mu f, g)_\nu = \sum_{I \in \mathcal{D}^\mu,\, J \in \mathcal{D}^\nu} (f, h_I^\mu)_\mu \, (H_\mu h_I^\mu, h_J^\nu)_\nu \, (g, h_J^\nu)_\nu.$$

4.1. Bad and good parts of f and g

We use the "good-bad" decomposition of the test functions $f, g$ exactly as this has been done in [31], [33]-[35]. Consider two fixed lattices $\mathcal{D}^\mu$, $\mathcal{D}^\nu$ (so we fixed a point in $\Omega$, see the notations above).

We call the interval $I \in \mathcal{D}^\mu$ bad if there exists $J \in \mathcal{D}^\nu$ such that

$$|J| > |I|, \qquad \operatorname{dist}(e(J), I) < |J|^{3/4} |I|^{1/4}. \tag{4.4}$$

Here $e(J) := \partial J \cup \{\text{midpoint of } J\}$. Similarly one defines bad intervals $J \in \mathcal{D}^\nu$.

Definition. We fix a large integer $r = C(C_\chi, C_d)$ to be chosen later, and we say that $I \in \mathcal{D}^\mu$ is essentially bad if there exists $J \in \mathcal{D}^\nu$ satisfying (4.4) such that it is much longer than $I$, namely, $|J| > 2^r |I|$.

If the interval is not essentially bad, it is called good.
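The definition of an essentially bad interval can be tested mechanically. The sketch below is only an illustration under stated assumptions: the competing lattice is the standard dyadic lattice shifted by `omega2`, only intervals $J$ of length at most `max_len` (taken to be 1) are considered, and the names `dist_to_grid` and `is_essentially_bad` are hypothetical. It uses the observation that, for a fixed length, the union of the sets $e(J)$ (endpoints and midpoints) is the grid `omega2 + (|J|/2) Z`.

```python
import math


def dist_to_grid(a, b, spacing, offset):
    """Distance from the interval [a, b] to the nearest point of offset + spacing*Z."""
    k = math.ceil((a - offset) / spacing)
    right = offset + k * spacing      # first grid point that is >= a
    if right <= b:
        return 0.0                    # a grid point lies inside [a, b]
    left = right - spacing            # last grid point that is < a
    return min(a - left, right - b)


def is_essentially_bad(a, length, omega2, r, max_len=1.0):
    """Check whether I = [a, a + length] is essentially bad.

    J ranges over dyadic intervals of the lattice shifted by omega2 with
    2^r * |I| < |J| <= max_len; I is essentially bad if for some such J
    dist(e(J), I) < |J|^(3/4) * |I|^(1/4).
    """
    b = a + length
    size = max_len
    while size > (2 ** r) * length:
        # endpoints and midpoints of all J of this size form a grid of step size/2
        d = dist_to_grid(a, b, size / 2.0, omega2)
        if d < size ** 0.75 * length ** 0.25:
            return True
        size /= 2.0
    return False
```

For instance, with `r = 12` and `length = 2**-20`, an interval starting at `0.5` touches an endpoint of a long dyadic interval and is essentially bad, while an interval starting at `1/3` stays at relative distance about 1/6 from all the relevant grids and is good.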

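Now

$$f = f_{bad} + f_{good}, \qquad f_{bad} := \sum_{I \in \mathcal{D}^\mu,\ I \text{ is essentially bad}} \Delta_I^\mu f. \tag{4.5}$$

The same type of decomposition is used for $g$:

$$g = g_{bad} + g_{good}, \qquad g_{bad} := \sum_{J \in \mathcal{D}^\nu,\ J \text{ is essentially bad}} \Delta_J^\nu g. \tag{4.6}$$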

4.2. Estimates on good functions

We refer the reader to [31], [33]-[35] for the detailed explanation that it is enough to estimate $|(H_\mu f_{good}, g_{good})_\nu|$, because

$$(H_\mu f, g)_\nu = (H_\mu f_{good}, g_{good})_\nu + (H_\mu f_{bad}, g_{good})_\nu + (H_\mu f, g_{bad})_\nu. \tag{4.7}$$

We repeat here sketchily the reasoning of [21], [33]-[35]. In [31], [33]-[35] we proved the result that the mathematical expectation of $\|f_{bad}\|_\mu$, $\|g_{bad}\|_\nu$ is small if $r$ is large. In fact, the proof of this fact is based on the observation that

$$\mathbb{P}\{(\omega_1, \omega_2) \in \Omega : I \text{ is essentially bad} \mid I \in \mathcal{D}^\mu\} \le \tau(r) \to 0, \quad r \to \infty. \tag{4.8}$$

So we consider the following result as already proved.

Theorem 4.1. We consider the decomposition of $f$ into bad and good parts, and take the bad part of it for every $\omega = (\omega_1, \omega_2) \in \Omega$. Let $\mathbb{E}$ denote the expectation with respect to $(\Omega, \mathbb{P})$. Then

$$\mathbb{E}\,\|f_{bad}\|_\mu \le \varepsilon(r) \|f\|_\mu, \quad \text{where } \varepsilon(r) \to 0, \ r \to \infty. \tag{4.9}$$

The same with $g$:

$$\mathbb{E}\,\|g_{bad}\|_\nu \le \varepsilon(r) \|g\|_\nu, \quad \text{where } \varepsilon(r) \to 0, \ r \to \infty. \tag{4.10}$$
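Coming back to (4.7) we get

$$|(H_\mu f, g)_\nu| \le |(H_\mu f_{good}, g_{good})_\nu| + \|H_\mu\| \, \|f_{bad}\|_\mu \|g_{good}\|_\nu + \|H_\mu\| \, \|f\|_\mu \|g_{bad}\|_\nu,$$

and, after taking the mathematical expectation and using (4.9), (4.10), the right-hand side is at most

$$\mathbb{E}\,|(H_\mu f_{good}, g_{good})_\nu| + 2 C \varepsilon(r) \|f\|_\mu \|g\|_\nu,$$

where $C$ denotes $\|H_\mu\|_{L^2(\mu) \to L^2(\nu)}$ (a priori finite, see Section 2). Choosing $r$ to be such that $C \varepsilon(r) < \frac14 C$, that is $\varepsilon(r) < \frac14$, choosing $f, g$ to make $|(H_\mu f, g)_\nu|$ almost attain $C \|f\|_\mu \|g\|_\nu$, and taking the mathematical expectation, we get

$$\tfrac12 C \|f\|_\mu \|g\|_\nu \le \mathbb{E}\,|(H_\mu f_{good}, g_{good})_\nu|$$

for these special $f, g$. If we manage to prove that for all $f, g$ (see the notations for $C_\chi$, $C_p$ in Theorem 2.1)

$$|(H_\mu f_{good}, g_{good})_\nu| \le C(C_d, C_\chi, C_p) \|f\|_\mu \|g\|_\nu \quad \forall f \in L^2(\mu), \ \forall g \in L^2(\nu), \tag{4.11}$$

then we obtain

$$\|H_\mu\|_{L^2(\mu) \to L^2(\nu)} = C \le 2 C(C_d, C_\chi, C_p),$$

which finishes the proof of Theorem 2.1.
The rest is devoted to the proof of (4.11).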



5. First reduction of the estimate on good functions (4.11). Long range interaction

So let the lattices $\mathcal{D}^\mu$, $\mathcal{D}^\nu$ be fixed, and let $f, g$ be two good functions with respect to these lattices. Boundedness on characteristic functions declared in (2.20), (2.21) obviously implies

$$|(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu| \le C_\chi \|\Delta_I^\mu f\|_\mu \|\Delta_J^\nu g\|_\nu. \tag{5.1}$$

Therefore, in the sum $(H_\mu f, g)_\nu = \sum_{I \in \mathcal{D}^\mu,\, J \in \mathcal{D}^\nu} (H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu$ the "diagonal" part can be easily estimated. Namely (below $r$ is the number involved in the definition of good functions in the previous section, and we always have $I \in \mathcal{D}^\mu$, $J \in \mathcal{D}^\nu$ without mentioning this):

$$\sum_{2^{-r}|J| \le |I| \le 2^r |J|,\ \operatorname{dist}(I,J) \le \max(|I|,|J|)} |(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu| \le C(r, C_\chi) \|f\|_\mu \|g\|_\nu. \tag{5.2}$$

Let us consider the sums

$$S_1 := \sum_{2^{-r}|J| \le |I| \le |J|,\ \operatorname{dist}(I,J) \ge |J|} |(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu|, \tag{5.3}$$

$$S_2 := \sum_{2^{-r}|I| \le |J| \le |I|,\ \operatorname{dist}(I,J) \ge |I|} |(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu|. \tag{5.4}$$
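They can be estimated in a symmetric fashion. So we will only deal with the first one.

Lemma 5.1. Let $|I| \le |J|$, $\operatorname{dist}(I, J) \ge |J|$. Then

$$|(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu| \le A \frac{|I|}{(\operatorname{dist}(I,J) + |I| + |J|)^2} \, \mu(I)^{1/2} \nu(J)^{1/2} \|\Delta_I^\mu f\|_\mu \|\Delta_J^\nu g\|_\nu. \tag{5.5}$$

Proof. Let $c$ be the center of $I$. We use the fact that $\int \Delta_I^\mu f \, d\mu = 0$ to write

$$(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu = \int_I d\mu(t) \int_J d\nu(s) \, \frac{1}{t - s} \, \Delta_I^\mu f(t) \Delta_J^\nu g(s) = \int_I d\mu(t) \int_J d\nu(s) \Big(\frac{1}{t - s} - \frac{1}{c - s}\Big) \Delta_I^\mu f(t) \Delta_J^\nu g(s).$$

Then one can easily see that

$$|(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu| \le A \iint \frac{|I|}{|t - s|^2} \, |\Delta_I^\mu f(t)| \, |\Delta_J^\nu g(s)| \, d\mu(t) \, d\nu(s). \tag{5.6}$$

Now we estimate the kernel $\frac{|I|}{|t-s|^2} \chi_I(t) \chi_J(s) \le A \frac{|I|}{(\operatorname{dist}(I,J) + |I| + |J|)^2}$, using that $|I| \le |J|$, $\operatorname{dist}(I, J) \ge |J|$. On the other hand

$$\|\Delta_I^\mu f\|_{L^1(\mu)} \le \mu(I)^{1/2} \|\Delta_I^\mu f\|_\mu, \qquad \|\Delta_J^\nu g\|_{L^1(\nu)} \le \nu(J)^{1/2} \|\Delta_J^\nu g\|_\nu.$$

And the lemma is proved. □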

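Let us notice that Lemma 5.1 allows us to write the following estimate for the sum $S_1$ of (5.3) (as usual $I \in \mathcal{D}^\mu$, $J \in \mathcal{D}^\nu$):

$$S_1 \le A \sum_{n=0}^{\infty} 2^{-n} \sum_{I, J:\ |I| = 2^{-n}|J|} \frac{|J|}{(\operatorname{dist}(I,J) + |I| + |J|)^2} \, \mu(I)^{1/2} \nu(J)^{1/2} \|\Delta_I^\mu f\|_\mu \|\Delta_J^\nu g\|_\nu. \tag{5.7}$$

Or

$$S_1 \le A \sum_{n=0}^{\infty} 2^{-n} \sum_{k \in \mathbb{Z}} \ \sum_{I, J:\ |I| = 2^{-n+k},\ |J| = 2^k} \frac{2^k}{(\operatorname{dist}(I,J) + 2^k)^2} \, \mu(I)^{1/2} \nu(J)^{1/2} \|\Delta_I^\mu f\|_\mu \|\Delta_J^\nu g\|_\nu. \tag{5.8}$$

To estimate "the $n, k$" slice

$$S_{n,k} := \sum_{I, J:\ |I| = 2^{-n+k},\ |J| = 2^k} \frac{2^k}{(\operatorname{dist}(I,J) + 2^k)^2} \, \mu(I)^{1/2} \nu(J)^{1/2} \|\Delta_I^\mu f\|_\mu \|\Delta_J^\nu g\|_\nu$$

let us introduce the notations

$$\varphi(t) := \sum_{I \in \mathcal{D}^\mu,\ |I| = 2^{-n+k}} \frac{\|\Delta_I^\mu f\|_\mu}{\mu(I)^{1/2}} \chi_I(t), \qquad \psi(s) := \sum_{J \in \mathcal{D}^\nu,\ |J| = 2^k} \frac{\|\Delta_J^\nu g\|_\nu}{\nu(J)^{1/2}} \chi_J(s).$$

Also

$$K_y(t, s) := \frac{y}{y^2 + |t - s|^2}, \quad y > 0, \ t, s \in \mathbb{R}.$$

Then, since $|t - s| \le \operatorname{dist}(I,J) + 2 \cdot 2^k$ for $t \in I$, $s \in J$,

$$S_{n,k} \le A \int d\mu(t) \int d\nu(s) \, K_{2^k}(t, s) \, \varphi(t) \psi(s). \tag{5.9}$$

Lemma 5.2. The integral operator $\varphi \to \int K_y(t, s) \varphi(t) \, d\mu(t)$ is bounded from $L^2(\mu)$ to $L^2(\nu)$ if $Q_{\mu,\nu}$ (recall that this quantity is equal to $\sup_{I \subset \mathbb{R}} \langle\mu\rangle_I \langle\nu\rangle_I$) is bounded. Its norm is bounded by $A Q_{\mu,\nu}^{1/2}$.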

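Let us postpone the proof of this lemma, and let us finish the estimate of $S_1$ using it. First of all the lemma gives the following estimate (notice that $Q_{\mu,\nu} \le A C_p$):

$$S_{n,k} \le C(C_p) \|\varphi\|_\mu \|\psi\|_\nu = C(C_p) \Big(\sum_{I \in \mathcal{D}^\mu,\ |I| = 2^{-n+k}} \|\Delta_I^\mu f\|_\mu^2\Big)^{1/2} \Big(\sum_{J \in \mathcal{D}^\nu,\ |J| = 2^k} \|\Delta_J^\nu g\|_\nu^2\Big)^{1/2}.$$

By the Cauchy inequality

$$\sum_k S_{n,k} \le C(C_p) \sum_k \Big(\sum_{J \in \mathcal{D}^\nu,\ |J| = 2^k} \|\Delta_J^\nu g\|_\nu^2\Big)^{1/2} \Big(\sum_{I \in \mathcal{D}^\mu,\ |I| = 2^{-n+k}} \|\Delta_I^\mu f\|_\mu^2\Big)^{1/2}$$

$$\le C(C_p) \Big(\sum_{J \in \mathcal{D}^\nu} \|\Delta_J^\nu g\|_\nu^2\Big)^{1/2} \Big(\sum_{I \in \mathcal{D}^\mu} \|\Delta_I^\mu f\|_\mu^2\Big)^{1/2} \le C(C_p) \|f\|_\mu \|g\|_\nu$$

by (4.1), (4.2). Then (5.8) gives $S_1 \le \sum_n 2^{-n} \sum_k S_{n,k}$, and so

$$S_1 \le C(C_p) \sum_{n=0}^{\infty} 2^{-n} \|f\|_\mu \|g\|_\nu = 2 C(C_p) \|f\|_\mu \|g\|_\nu,$$

and our long range interaction sum $S_1$ is finally estimated.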
Proof of Lemma 5.2.

Let us consider several other averaging operators. One of them is

$$I\varphi(s) := \int \chi_{[-\frac12, \frac12]}(s - t) \varphi(t) \, d\mu(t).$$

Another is as follows: let $G$ be the family of all intervals $\ell_k$ of the type $[2k, 2k + 2]$, $k \in \mathbb{Z}$. Consider

$$A_G \varphi(s) := \sum_k \chi_{\ell_k}(s) \int_{\ell_k} \varphi \, d\mu.$$

Consider also the shifted grid $G(x) = G + x$, $x \in [0, 2)$, and the corresponding $A_{G(x)}$. Notice that for $\varphi \ge 0$

$$I\varphi(s) \le a \int_0^2 A_{G(x)} \varphi(s) \, dx. \tag{5.10}$$

In fact, consider $([0, 2], \frac12 dx)$ as an obvious probability space of all grids $G(x)$. Then it is easy to see that for every $s$ the unit interval $[s - \frac12, s + \frac12]$ is (with probability at least 1/2) a subinterval of one of the intervals of $G(x)$. Then the above inequality becomes obvious (and $a = 4$).
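The probability claim behind (5.10) is easy to confirm numerically. The sketch below is an illustration only: it assumes the grid $G(x)$ consists of the cells $[2k + x, 2k + 2 + x]$, samples the shift $x$ on a uniform mesh of $[0, 2)$, and counts how often the unit interval $[s - \frac12, s + \frac12]$ lies inside a single cell. The function names are hypothetical.

```python
def unit_interval_inside_cell(s, x):
    """True if [s - 1/2, s + 1/2] lies inside one cell [2k + x, 2k + 2 + x]."""
    # position of the left endpoint inside its cell, a number in [0, 2)
    position = (s - 0.5 - x) % 2.0
    # the interval has length 1, so it fits iff the left endpoint is in [0, 1]
    return position <= 1.0


def fraction_of_covering_shifts(s, samples=1000):
    """Fraction of shifts x in [0, 2) for which the unit interval around s fits in a cell."""
    hits = 0
    for j in range(samples):
        x = 2.0 * (j + 0.5) / samples   # midpoint rule on [0, 2)
        if unit_interval_inside_cell(s, x):
            hits += 1
    return hits / samples
```

The left endpoint of the unit interval is uniformly distributed inside its cell of length 2, and the interval fits exactly when that endpoint falls in the first half of the cell, which is why the fraction is one half for every $s$.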

On the other hand, the norm of the operator $A_G$ as an operator from $L^2(\mu)$ to $L^2(\nu)$ is bounded by $2 Q_{\mu,\nu}^{1/2}$. In fact, if $\ell_k = [2k, 2k + 2]$, then

$$\|A_G \varphi\|_\nu^2 \le \sum_k \Big(\int_{\ell_k} |\varphi| \, d\mu\Big)^2 \nu(\ell_k) \le \sum_k \Big(\int_{\ell_k} |\varphi|^2 \, d\mu\Big) \nu(\ell_k) \mu(\ell_k) \le 4 Q_{\mu,\nu} \sum_k \int_{\ell_k} |\varphi|^2 \, d\mu = 4 Q_{\mu,\nu} \|\varphi\|_\mu^2.$$

The same, of course, can be said about $\|A_{G(x)} \varphi\|_\nu$. Then (5.10) implies that the norm of the averaging operator $I$ from $L^2(\mu)$ to $L^2(\nu)$ is bounded by $A Q_{\mu,\nu}^{1/2}$. Let us call by $I_r$ the operator of the same type as $I$, but the convolution now will be with the normalized characteristic function of the interval $[-r, r]$:

$$I_r \varphi(s) := \frac{1}{2r} \int \chi_{[-r, r]}(s - t) \varphi(t) \, d\mu(t).$$

It is obvious that the reasoning above can be repeated without any change and we get

$$\|I_r \varphi\|_\nu \le A Q_{\mu,\nu}^{1/2} \|\varphi\|_\mu. \tag{5.11}$$

To finish with the operator given by $\varphi \to \int K_y(t, s) \varphi(t) \, d\mu(t)$ as an operator from $L^2(\mu)$ to $L^2(\nu)$, let us notice that (and this is a standard inequality for the Poisson kernel)

$$\int K_y(t, s) |\varphi(t)| \, d\mu(t) \le A \sum_{k=0}^{\infty} 2^{-k} (I_{2^k y} |\varphi|)(s).$$

Now Lemma 5.2 follows immediately from (5.11) and the last inequality.
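The pointwise domination of the kernel $K_y(t,s) = y/(y^2 + |t-s|^2)$ by a geometric sum of normalized indicators, which is the content of the last inequality, can be checked numerically. The sketch below is illustrative and rests on assumptions: it compares the kernel with $\sum_k 2^{-k} \frac{1}{2 \cdot 2^k y} \chi_{[-2^k y, 2^k y]}$ as functions of $u = |t - s|$, truncates the sum at 60 terms, and the helper names are hypothetical. A short computation suggests that the ratio never exceeds 6.

```python
def poisson_like_kernel(u, y):
    """K_y at distance u = |t - s|: y / (y^2 + u^2)."""
    return y / (y * y + u * u)


def dyadic_average_majorant(u, y, terms=60):
    """Sum over k < terms of 2^{-k} times the normalized indicator of [-2^k y, 2^k y] at u."""
    total = 0.0
    for k in range(terms):
        radius = (2.0 ** k) * y
        if abs(u) <= radius:
            total += (2.0 ** -k) / (2.0 * radius)
    return total


def worst_ratio(y, samples):
    """Largest value of kernel / majorant over the sample distances."""
    return max(
        poisson_like_kernel(u, y) / dyadic_average_majorant(u, y)
        for u in samples
    )
```

Integrating the pointwise bound against $|\varphi| \, d\mu$ gives the displayed inequality, with the absolute constant $A$ being any upper bound for this ratio.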



6. The rest of the long range interaction

As always all $I$'s below are in $\mathcal{D}^\mu$, all $J$'s below are in $\mathcal{D}^\nu$. Consider now the following two sums.

$$\sigma_1 := \sum_{|I| < 2^{-r}|J|,\ I \cap J = \emptyset} |(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu|, \tag{6.1}$$

$$\sigma_2 := \sum_{|J| < 2^{-r}|I|,\ I \cap J = \emptyset} |(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu|. \tag{6.2}$$

They can be estimated in a symmetric fashion. So we will only deal with the first one. Notice that $f, g$ are good functions. This means, in particular, that $I, J$ which we meet in (6.1) satisfy

$$\operatorname{dist}(I, \partial J) \ge |J|^{3/4} |I|^{1/4}. \tag{6.3}$$

This is just (4.4) for disjoint $I, J$ with $I$ not essentially bad (see the definition at the beginning of Subsection 4.1).

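Lemma 6.1. Let $I, J$ be disjoint, $|I| < 2^{-r}|J|$, and satisfy (6.3). Then

$$|(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu| \le A \frac{|I|^{1/2} |J|^{1/2}}{(\operatorname{dist}(I,J) + |I| + |J|)^2} \, \mu(I)^{1/2} \nu(J)^{1/2} \|\Delta_I^\mu f\|_\mu \|\Delta_J^\nu g\|_\nu. \tag{6.4}$$

Proof. If $\operatorname{dist}(I, J) \ge |J|$, this has been already proved in Lemma 5.1. So let $\operatorname{dist}(I, J) \le |J|$, $I, J$ being disjoint. Repeating (5.6) one gets

$$|(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu| \le A \iint \frac{|I|}{|t - s|^2} \, |\Delta_I^\mu f(t)| \, |\Delta_J^\nu g(s)| \, d\mu(t) \, d\nu(s).$$

Now we estimate the kernel $\frac{|I|}{|t - s|^2} \chi_I(t) \chi_J(s) \le A \frac{|I|}{\operatorname{dist}(I, \partial J)^2}$. Therefore,

$$|(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu| \le A \frac{|I|}{\operatorname{dist}(I, \partial J)^2} \, \mu(I)^{1/2} \nu(J)^{1/2} \|\Delta_I^\mu f\|_\mu \|\Delta_J^\nu g\|_\nu. \tag{6.5}$$

We use (6.3) to write

$$\frac{|I|}{\operatorname{dist}(I, \partial J)^2} \le \frac{|I|}{|J|^{3/2} |I|^{1/2}} = \frac{|I|^{1/2} |J|^{1/2}}{|J|^2} \le A \frac{|I|^{1/2} |J|^{1/2}}{(\operatorname{dist}(I,J) + |I| + |J|)^2},$$

because we assumed $\operatorname{dist}(I, J) \le |J|$ and $I$ is shorter than $J$. This inequality and (6.5) finish the proof of the lemma. □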

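Let us notice that Lemma 6.1 allows us to write the following estimate for the sum $\sigma_1$ from (6.1):

$$\sigma_1 \le A \sum_{n=0}^{\infty} 2^{-n/2} \sum_{I, J:\ |I| = 2^{-n}|J|} \frac{|J|}{(\operatorname{dist}(I,J) + |I| + |J|)^2} \, \mu(I)^{1/2} \nu(J)^{1/2} \|\Delta_I^\mu f\|_\mu \|\Delta_J^\nu g\|_\nu. \tag{6.6}$$

Or

$$\sigma_1 \le A \sum_{n=0}^{\infty} 2^{-n/2} \sum_{k \in \mathbb{Z}} \ \sum_{I, J:\ |I| = 2^{-n+k},\ |J| = 2^k} \frac{2^k}{(\operatorname{dist}(I,J) + 2^k)^2} \, \mu(I)^{1/2} \nu(J)^{1/2} \|\Delta_I^\mu f\|_\mu \|\Delta_J^\nu g\|_\nu. \tag{6.7}$$

To estimate "the $n, k$" slice

$$\sigma_{n,k} := \sum_{I, J:\ |I| = 2^{-n+k},\ |J| = 2^k} \frac{2^k}{(\operatorname{dist}(I,J) + 2^k)^2} \, \mu(I)^{1/2} \nu(J)^{1/2} \|\Delta_I^\mu f\|_\mu \|\Delta_J^\nu g\|_\nu,$$

let us use again the notations

$$\varphi(t) := \sum_{I \in \mathcal{D}^\mu,\ |I| = 2^{-n+k}} \frac{\|\Delta_I^\mu f\|_\mu}{\mu(I)^{1/2}} \chi_I(t), \qquad \psi(s) := \sum_{J \in \mathcal{D}^\nu,\ |J| = 2^k} \frac{\|\Delta_J^\nu g\|_\nu}{\nu(J)^{1/2}} \chi_J(s).$$

Also

$$K_y(t, s) := \frac{y}{y^2 + |t - s|^2}, \quad y > 0, \ t, s \in \mathbb{R}.$$

Then

$$\sigma_{n,k} \le A \int d\mu(t) \int d\nu(s) \, K_{2^k}(t, s) \, \varphi(t) \psi(s). \tag{6.8}$$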



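Lemma 5.2 now gives as before the estimate of $\sigma_1$. First of all the lemma gives the following estimate (notice that $Q_{\mu,\nu} \le A C_p$):

$$\sigma_{n,k} \le C(C_p) \|\varphi\|_\mu \|\psi\|_\nu = C(C_p) \Big(\sum_{I \in \mathcal{D}^\mu,\ |I| = 2^{-n+k}} \|\Delta_I^\mu f\|_\mu^2\Big)^{1/2} \Big(\sum_{J \in \mathcal{D}^\nu,\ |J| = 2^k} \|\Delta_J^\nu g\|_\nu^2\Big)^{1/2}.$$

By the Cauchy inequality

$$\sum_k \sigma_{n,k} \le C(C_p) \sum_k \Big(\sum_{J \in \mathcal{D}^\nu,\ |J| = 2^k} \|\Delta_J^\nu g\|_\nu^2\Big)^{1/2} \Big(\sum_{I \in \mathcal{D}^\mu,\ |I| = 2^{-n+k}} \|\Delta_I^\mu f\|_\mu^2\Big)^{1/2}$$

$$\le C(C_p) \Big(\sum_{J \in \mathcal{D}^\nu} \|\Delta_J^\nu g\|_\nu^2\Big)^{1/2} \Big(\sum_{I \in \mathcal{D}^\mu} \|\Delta_I^\mu f\|_\mu^2\Big)^{1/2} \le C(C_p) \|f\|_\mu \|g\|_\nu$$

by (4.1), (4.2). Then (6.7) gives $\sigma_1 \le \sum_{n=0}^{\infty} 2^{-n/2} \sum_k \sigma_{n,k}$, and so

$$\sigma_1 \le C(C_p) \sum_{n=0}^{\infty} 2^{-n/2} \|f\|_\mu \|g\|_\nu = A \, C(C_p) \|f\|_\mu \|g\|_\nu,$$

and our long range interaction sum $\sigma_1$ is finally estimated. The symmetric estimate holds for $\sigma_2$ from (6.2).

Conclusion: if $f, g$ are good, then the sum of all terms $|(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu|$ such that either $|I|/|J| \in [2^{-r}, 2^r]$ or $I \cap J = \emptyset$ has the correct estimate $C(C_p) \|f\|_\mu \|g\|_\nu$.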



7. The short range interaction. Corona decomposition.

As always all $I$'s below are in $\mathcal{D}^\mu$, all $J$'s below are in $\mathcal{D}^\nu$.

Let us consider the sums

$$\rho := \sum_{|I| < 2^{-r}|J|,\ I \subset J,\ \operatorname{dist}(I, e(J)) \ge |J|^{3/4} |I|^{1/4}} (\Delta_I^\mu f, H_\nu \Delta_J^\nu g)_\mu, \tag{7.1}$$

$$\tau := \sum_{|J| < 2^{-r}|I|,\ J \subset I,\ J \in \mathcal{D}^\nu,\ J \text{ is good}} (H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu. \tag{7.2}$$

They can be estimated in a symmetric fashion. So we will only deal with, say, the second one. It is very important that unlike the sums $S_j$, $\sigma_j$, this sum does not have an absolute value on each term.

Consider each term of $\tau$ and split it into three terms. To do this, let $I_i$ denote the half of $I$ which contains $J$, and let $I_n$ be the other half. Let $\hat I$ denote an arbitrary superinterval of $I_i$ in the same lattice: $\hat I \in \mathcal{D}^\mu$.

We write

$$(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu = (H_\mu(\chi_{I_n} \Delta_I^\mu f), \Delta_J^\nu g)_\nu + (H_\mu(\chi_{I_i} \Delta_I^\mu f), \Delta_J^\nu g)_\nu$$

$$= (H_\mu(\chi_{I_n} \Delta_I^\mu f), \Delta_J^\nu g)_\nu + \langle \Delta_I^\mu f \rangle_{\mu, I_i} (H_\mu(\chi_{\hat I}), \Delta_J^\nu g)_\nu - \langle \Delta_I^\mu f \rangle_{\mu, I_i} (H_\mu(\chi_{\hat I \setminus I_i}), \Delta_J^\nu g)_\nu.$$

Here $\langle \Delta_I^\mu f \rangle_{\mu, I_i}$ is the average of $\Delta_I^\mu f$ with respect to $\mu$ over $I_i$, which is the same as the value of this function on $I_i$ (by construction $\Delta_I^\mu f$ assumes on $I$ two values, one on $I_i$, one on $I_n$).

Definition. We call them as follows: the first one is "the neighbor-term", the second one is "the difficult term", the third one is "the stopping term".

Notice that it may happen that $\hat I = I_i$. Then the stopping term is zero.

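7.1. The estimate of neighbor-terms

We have the same estimate as in Lemma 6.1:

$$|(H_\mu(\chi_{I_n} \Delta_I^\mu f), \Delta_J^\nu g)_\nu| \le A \frac{|J|^{1/2} |I|^{1/2}}{(\operatorname{dist}(I_n, J) + |I| + |J|)^2} \, \mu(I_n)^{1/2} \nu(J)^{1/2} \|\chi_{I_n} \Delta_I^\mu f\|_\mu \|\Delta_J^\nu g\|_\nu. \tag{7.3}$$

Of course, $\|\chi_{I_n} \Delta_I^\mu f\|_\mu \le \|\Delta_I^\mu f\|_\mu$. So the estimate of the sum of absolute values of neighbor-terms is exactly the same as the estimate of $\sigma_1$ in the preceding section.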

7.2. The estimate of stopping terms

Here the fact that we deal with the Hilbert transform will be used in a very essential way. The estimate for other Calderon-Zygmund kernels will definitely require some new tricks. We need the following definition.

Definition. Given an interval $I = [a, b]$ and any measure $d\sigma$ on the real line, we write

$$P_I(d\sigma) := \frac{1}{\pi} \int_{\mathbb{R}} \frac{b - a}{(b - a)^2 + ((b + a)/2 - t)^2} \, d\sigma(t).$$

This is the Poisson integral at the point whose real part is the center of the interval, and
imaginary part is the length of the interval.
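For a measure that is a finite sum of point masses the Poisson integral of an interval is a finite sum, which makes the definition easy to experiment with. The sketch below is an illustration under assumptions: the measure is given as a list of `(position, mass)` pairs, the classical normalization $1/\pi$ of the Poisson kernel of the upper half-plane is used (constants play no role in the estimates), and the function name `poisson_integral` is hypothetical.

```python
import math


def poisson_integral(a, b, atoms):
    """Poisson integral P_I(d sigma) for I = [a, b].

    atoms is a list of (position, mass) pairs describing a discrete measure
    sigma; the kernel is evaluated at the point whose real part is the
    center of I and whose imaginary part is the length of I.
    """
    length = b - a
    center = 0.5 * (a + b)
    total = 0.0
    for position, mass in atoms:
        total += mass * length / (length ** 2 + (center - position) ** 2)
    return total / math.pi


# Example: a unit mass at the center of [0, 1] and a mass 2 at distance 1 from it.
example_value = poisson_integral(0.0, 1.0, [(0.5, 1.0), (1.5, 2.0)])
```

In the example each atom contributes $1/\pi$: the unit mass sits at the center, and the mass 2 at distance equal to the length of the interval is damped by the factor $1/2$.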

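We want to estimate

$$|\langle \Delta_I^\mu f \rangle_{\mu, I_i} (H_\mu(\chi_{\hat I \setminus I_i}), \Delta_J^\nu g)_\nu|.$$

First of all, obviously

$$|\langle \Delta_I^\mu f \rangle_{\mu, I_i}| \le \frac{\|\Delta_I^\mu f\|_\mu}{\mu(I_i)^{1/2}}.$$

Secondly,

$$|(H_\mu(\chi_{\hat I \setminus I_i}), \Delta_J^\nu g)_\nu| = |(\chi_{\hat I \setminus I_i}, H_\nu \Delta_J^\nu g)_\mu| \le A \int_{\hat I \setminus I_i} \int_J \frac{|J|}{|t - s|^2} \, |\Delta_J^\nu g(s)| \, d\nu(s) \, d\mu(t).$$

This is the usual trick with subtraction of the kernel, it uses the fact that $\int \Delta_J^\nu g \, d\nu = 0$. We continue by denoting the center of $I_i$ by $c$:

$$\frac{|J|}{|t - s|^2} \le A \Big(\frac{|J|}{|I|}\Big)^{1/2} \frac{|I_i|}{|I_i|^2 + (c - t)^2}, \qquad t \notin I_i, \ s \in J, \ \operatorname{dist}(J, e(I)) \ge |I|^{3/4} |J|^{1/4},$$

where $e(I)$ is the two ends and the center of $I$. The elementary inequality above uses of course the specific nature of the Hilbert transform. We continue, using the definition above and the inequality $\|\Delta_J^\nu g\|_{L^1(\nu)} \le \nu(J)^{1/2} \|\Delta_J^\nu g\|_\nu$.

Thus

$$|(H_\mu(\chi_{\hat I \setminus I_i}), \Delta_J^\nu g)_\nu| \le A \, \nu(J)^{1/2} \|\Delta_J^\nu g\|_\nu \Big(\frac{|J|}{|I|}\Big)^{1/2} P_{I_i}(\chi_{\hat I \setminus I_i} \, d\mu). \tag{7.4}$$

We now get the estimate of the stopping term:

$$|\langle \Delta_I^\mu f \rangle_{\mu, I_i} (H_\mu(\chi_{\hat I \setminus I_i}), \Delta_J^\nu g)_\nu| \le A \Big(\frac{|J|}{|I|}\Big)^{1/2} \Big(\frac{\nu(J)}{\mu(I_i)}\Big)^{1/2} P_{I_i}(\chi_{\hat I \setminus I_i} \, d\mu) \, \|\Delta_I^\mu f\|_\mu \|\Delta_J^\nu g\|_\nu. \tag{7.5}$$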

7.3. Pivotal property, which might turn out to be a necessary condition for the two weight boundedness of the Hilbert transform

Let $I \in \mathcal{D}^\mu$. Let $\{I_\alpha\}$ be a finite family of disjoint subintervals of $I$ belonging to the same lattice. We call the following property the pivotal property:

$$\sum_\alpha [P_{I_\alpha}(\chi_{I \setminus I_\alpha} \, d\mu)]^2 \, \nu(I_\alpha) \le P \, \mu(I). \tag{7.6}$$

Notice that we always assume $P_\mu(z) P_\nu(z)$ to be uniformly bounded. (This property (2.22) is necessary for the two weight boundedness of the Hilbert transform.) In view of this, one can replace our pivotal property by an equivalent one (maybe with a different constant $P$):

$$\sum_\alpha [P_{I_\alpha}(\chi_I \, d\mu)]^2 \, \nu(I_\alpha) \le P \, \mu(I). \tag{7.7}$$

Now property (7.6) (or equivalently (7.7)) is the only thing we need to prove that the Hilbert transform is two weight bounded if and only if $P_\mu(z) P_\nu(z)$ is uniformly bounded and the test conditions (2.20), (2.21) of Sawyer type written down in Theorem 2.1 hold.

In other words property (7.6) (or equivalently (7.7)) is the only thing we need to prove our two weight T1 theorem.

We want to emphasize that actually we do not need extra assumptions on doubling (as in [37]) or an extra assumption on the boundedness of the maximal operators $M_\mu$, $M_\nu$ as in this paper's Theorems 2.1, 2.2.

We only need property (7.6) (or equivalently (7.7)), of course together with its symmetric counterpart, where the places of $\mu$ and $\nu$ are exchanged. They can be necessary for the boundedness of $H_\mu$. If so we are done completely: a two weight T1 theorem is obtained with no restrictions whatsoever. But we cannot either prove or disprove the necessity of (7.6) (or equivalently (7.7)) for the boundedness $H_\mu : L^2(\mu) \to L^2(\nu)$.

Remark. What we know is that uniform boundedness of $P_\mu(z) P_\nu(z)$ alone does not imply (7.6). This can be understood with the use of the Bellman function method and this will be discussed in the last section of this paper.



However, the extra condition on doubling imposed in [37] allowed us to deduce (7.6) from the boundedness $H_\mu : L^2(\mu) \to L^2(\nu)$.

Also now we will show that (7.6) follows trivially from the assumption of boundedness $M_\mu : L^2(\mu) \to L^2(\nu)$. This is our extra assumption in Theorems 2.1, 2.2. The symmetric counterpart of (7.6), where the places of $\mu$ and $\nu$ are exchanged, follows from the assumption of boundedness $M_\nu : L^2(\nu) \to L^2(\mu)$.

Lemma 7.1. Let $M_\mu : L^2(\mu) \to L^2(\nu)$ be bounded. Then (7.7) holds with constant $K = A \|M_\mu\|^2$, where $A$ is an absolute constant.

Proof. It is a standard estimate of the Poisson integral via the maximal function (see, for example, [IS]), which gives

$$P_{I_\alpha}(\chi_I \, d\mu) \le A \inf_{x \in I_\alpha} (M_\mu \chi_I)(x). \tag{7.8}$$

Then (7.8) implies

$$\sum_\alpha P_{I_\alpha}(\chi_I \, d\mu)^2 \, \nu(I_\alpha) \le A \sum_\alpha \int_{I_\alpha} (M_\mu \chi_I)(x)^2 \, d\nu(x) \le A \int (M_\mu \chi_I)(x)^2 \, d\nu(x) \le A \|M_\mu\|^2 \mu(I).$$

□



7.4. The choice of stopping intervals

Let $K$ be a large constant to be chosen later. Fix an interval $\hat I \in \mathcal{D}^\mu$. Let us call its subinterval $I \in \mathcal{D}^\mu$ a stopping interval if it is the first one (by going from bigger ones to the smaller ones by inclusion) such that

$$[P_I(\chi_{\hat I \setminus I} \, d\mu)]^2 \, \nu(I) \ge K \mu(I). \tag{7.9}$$

Here is the place where we use the pivotal property (7.6):

Theorem 7.2. If $\mu, \nu$ are arbitrary positive measures such that (7.6) is satisfied, then for every $\hat I \in \mathcal{D}^\mu$

$$\sum_{I \in \mathcal{D}^\mu,\ I \subset \hat I,\ I \text{ is maximal stopping}} \mu(I) \le \tfrac12 \mu(\hat I), \tag{7.10}$$

provided that the constant $K$ in the stopping criterion (7.9) is large enough.

Proof. In fact, let $\{I_\alpha\}$ be the family of maximal stopping intervals inside $\hat I$ according to the stopping criterion just introduced in (7.9). Then

$$K \sum_\alpha \mu(I_\alpha) \le \sum_\alpha [P_{I_\alpha}(\chi_{\hat I \setminus I_\alpha} \, d\mu)]^2 \, \nu(I_\alpha).$$

The intervals $\{I_\alpha\}$ are disjoint subintervals of $\hat I$, and so (7.6) is used now:

$$\sum_\alpha \mu(I_\alpha) \le \frac{P}{K} \mu(\hat I) \le \tfrac12 \mu(\hat I)$$

if $K \ge 2P$.

□



Definitions. 1. For any dyadic interval $I$, $F(I)$ will denote its father.

2. The tree distance between the dyadic intervals of the same lattice will be denoted by $t(I_1, I_2)$. Of course $t(I, F(I)) = 1$.

3. Stopping intervals of the same lattice will also form a tree. We will call it $\mathcal{S}$. The tree distance inside $\mathcal{S}$ will be denoted by $r(S_1, S_2)$. Of course

$$r(S_1, S_2) \le t(S_1, S_2). \tag{7.11}$$



7.5. Stopping tree

In Section 7 we introduced the sum, which we are left to estimate:

$$\tau = \sum_{|J| < 2^{-r}|I|,\ J \subset I,\ J \in \mathcal{D}^\nu,\ J \text{ is good}} (H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu. \tag{7.12}$$

Each term of $\tau$ was decomposed into three terms. We recall: let $I_i$ denote the half of $I$ which contains $J$, and let $I_n$ be the other half. Let $\hat I$ denote an arbitrary superinterval of $I_i$ in the same lattice: $\hat I \in \mathcal{D}^\mu$.

For a given $I \in \mathcal{D}^\mu$, $J \subset I$, $J \in \mathcal{D}^\nu$, $J$ good, we write down the following splitting

$$(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu = (H_\mu(\chi_{I_n} \Delta_I^\mu f), \Delta_J^\nu g)_\nu + (H_\mu(\chi_{I_i} \Delta_I^\mu f), \Delta_J^\nu g)_\nu$$

$$= (H_\mu(\chi_{I_n} \Delta_I^\mu f), \Delta_J^\nu g)_\nu + \langle \Delta_I^\mu f \rangle_{\mu, I_i} (H_\mu(\chi_{\hat I}), \Delta_J^\nu g)_\nu - \langle \Delta_I^\mu f \rangle_{\mu, I_i} (H_\mu(\chi_{\hat I \setminus I_i}), \Delta_J^\nu g)_\nu. \tag{7.13}$$

Here $\langle \Delta_I^\mu f \rangle_{\mu, I_i}$ is the average of $\Delta_I^\mu f$ with respect to $\mu$ over $I_i$, which is the same as the value of this function on $I_i$ (by construction $\Delta_I^\mu f$ assumes on $I$ two values, one on $I_i$, one on $I_n$).

We called them as follows: the first one is "the neighbor-term", the second one is "the difficult term", the third one is "the stopping term".

In what follows it is convenient to think that we consider our problem on the circle $\mathbb{T}$ rather than on the line. We want to explain how to choose $\hat I$ in the stopping terms above.

Construction of the stopping tree $\mathcal{S}$. We choose first $\hat I = \mathbb{T}$ (this is why the circle is more convenient, we have the first "hat" interval). Then choose its maximal stopping subintervals $\{I\}$. Just use the criterion (7.9) from Subsection 7.4. Call each of these $I$'s by the name $\hat S$. In each $\hat S$ again find its maximal stopping subintervals $\{S\}$. Et cetera. All intervals which were thus built we call "stopping intervals". They have their generation. Stopping intervals, as a rule, will be denoted by symbols with "hats".

To explain the choice of $\hat I$ in the stopping terms above we need the notations.

Notations. If $\hat S \in \mathcal{D}^\mu$ is a stopping interval, and $\{S\}$, $S \in \mathcal{D}^\mu$, is the collection of its maximal stopping subintervals (we call them stopping sons of $\hat S$, their stopping tree distance to $\hat S$ is one: $r(S, \hat S) = 1$), we call $O_{\hat S}$ the collection of all intervals $I$ from both lattices $\mathcal{D}^\mu$, $\mathcal{D}^\nu$ such that the top side of the square $Q_I$ lies in the set $\Omega_{\hat S} := Q_{\hat S} \setminus \bigcup_S Q_S$, the union being taken over the stopping sons $S$ of $\hat S$. In particular, $\hat S \in O_{\hat S}$, but its stopping sons are not in $O_{\hat S}$.

The choice of $\hat I$ in the stopping terms above in (7.13) is as follows: let $I, J$ be as above, namely $J \subset I$, $J \in \mathcal{D}^\nu$, $J$ good, $J \subset I_i$, where $I_i$ is a son of $I$; we choose the first (and unique) stopping interval $\hat S$ such that $I_i \in O_{\hat S}$. Then we just put $\hat I = \hat S$.

Definition. Recall that the father of an interval $I$ with respect to the tree of all dyadic intervals was called $F(I)$. If $S \in \mathcal{S}$, then its father with respect to the tree $\mathcal{S}$ will be always called from now on $\hat S$.

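Let us introduce the sum of absolute values of the "stopping terms" of the sum $\tau$ above (as always $I \in \mathcal{D}^\mu$, $J \in \mathcal{D}^\nu$):

$$t := \sum_{|J| < 2^{-r}|I|,\ J \subset I,\ J \in \mathcal{D}^\nu,\ J \text{ is good}} |\langle \Delta_I^\mu f \rangle_{\mu, I_i} (H_\mu(\chi_{\hat I \setminus I_i}), \Delta_J^\nu g)_\nu|.$$

To estimate it we can use (7.5). Then (recall that $I_i$ is the half of $I$ containing $J$)

$$t \le A \, T, \qquad T := \sum_{|J| < 2^{-r}|I|,\ J \subset I,\ \operatorname{dist}(J, e(I)) \ge |I|^{3/4} |J|^{1/4}} \Big(\frac{|J|}{|I|}\Big)^{1/2} \Big(\frac{\nu(J)}{\mu(I_i)}\Big)^{1/2} P_{I_i}(\chi_{\hat I \setminus I_i} \, d\mu) \, \|\Delta_I^\mu f\|_\mu \|\Delta_J^\nu g\|_\nu.$$

We will follow the steps of [31] (and we will use the stopping criterion (7.9) based on the constant $K$) to prove the following theorem.

Theorem 7.3.

$$T \le C(K) \|f\|_\mu \|g\|_\nu.$$

Proof. Put

$$\tau_{n,k} := \sum_{|J| < 2^{-r}|I|,\ J \subset I_i,\ |I| = 2^k,\ |J| = 2^{-n+k}} \Big(\frac{\nu(J)}{\mu(I_i)}\Big)^{1/2} P_{I_i}(\chi_{\hat I \setminus I_i} \, d\mu) \, \|\Delta_I^\mu f\|_\mu \|\Delta_J^\nu g\|_\nu.$$

Then abusing slightly the notations we denote the halves of $I$ by $I_1, I_2$. We get

$$\tau_{n,k} \le \sum_{i=1,2} \ \sum_{|I| = 2^k} \|\Delta_I^\mu f\|_\mu \frac{P_{I_i}(\chi_{\hat I \setminus I_i} \, d\mu)}{\mu(I_i)^{1/2}} \sum_{J \subset I_i,\ |J| = 2^{-n+k}} \nu(J)^{1/2} \|\Delta_J^\nu g\|_\nu.$$

Consider only $I_1$. By the Cauchy inequality the estimate will be

$$\sum_{|I| = 2^k} \|\Delta_I^\mu f\|_\mu \Big(\frac{[P_{I_1}(\chi_{\hat I \setminus I_1} \, d\mu)]^2 \, \nu(I_1)}{\mu(I_1)}\Big)^{1/2} \Big(\sum_{J \subset I_1,\ |J| = 2^{-n+k}} \|\Delta_J^\nu g\|_\nu^2\Big)^{1/2}.$$

The square of the middle term is $[P_{I_1}(\chi_{\hat I \setminus I_1} \, d\mu)]^2 \nu(I_1) / \mu(I_1)$. By (7.9) we get that it is bounded by $K$. In fact, this was our choice of $\hat I$, which ensures that $I_1 \in O_{\hat I}$, and so the stopping inequality (7.9) is not yet achieved on $I_1$.

Thus, the last expression above is bounded by (this is just the Cauchy inequality)

$$K^{1/2} \sum_{|I| = 2^k} \|\Delta_I^\mu f\|_\mu \Big(\sum_{J \subset I_1,\ |J| = 2^{-n+k}} \|\Delta_J^\nu g\|_\nu^2\Big)^{1/2} \le K^{1/2} \Big(\sum_{|I| = 2^k} \|\Delta_I^\mu f\|_\mu^2\Big)^{1/2} \Big(\sum_{|I| = 2^k} \ \sum_{J \subset I_1,\ |J| = 2^{-n+k}} \|\Delta_J^\nu g\|_\nu^2\Big)^{1/2}.$$

As a result we get the estimate on $\tau_{n,k}$:

$$\tau_{n,k} \le C(K) \Big(\sum_{|I| = 2^k} \|\Delta_I^\mu f\|_\mu^2\Big)^{1/2} \Big(\sum_{|J| = 2^{-n+k}} \|\Delta_J^\nu g\|_\nu^2\Big)^{1/2}.$$

Now it is obvious from the formulae for $T$ and $\tau_{n,k}$ that

$$T \le \sum_n 2^{-n/2} \sum_k \tau_{n,k}.$$

But from the estimate above and the Cauchy inequality $\sum_k \tau_{n,k} \le C(K) \|f\|_\mu \|g\|_\nu$. So we get Theorem 7.3.

□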



8. Difficult terms and several paraproducts

Let us recall that $f, g$ are good functions and that in the sum

$$\tau = \sum_{|J| < 2^{-r}|I|,\ J \subset I,\ J \in \mathcal{D}^\nu,\ J \text{ is good}} (H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu \tag{8.1}$$

we consider each term of $\tau$ and split it into three terms. To do this, let $I_i$ denote the half of $I$ which contains $J$, and let $I_n$ be the other half. Let $S$ denote the smallest superinterval of $I_i$ in the same lattice, $S \in \mathcal{D}^\mu$, $S \in \mathcal{S}$, such that

$$I_i \in O_S, \tag{8.2}$$

where the family of intervals $O_S$ was introduced shortly after (7.12). (In other words $S$ is the smallest stopping interval containing $I_i$.)

We wrote

$$(H_\mu \Delta_I^\mu f, \Delta_J^\nu g)_\nu = (H_\mu(\chi_{I_n} \Delta_I^\mu f), \Delta_J^\nu g)_\nu + (H_\mu(\chi_{I_i} \Delta_I^\mu f), \Delta_J^\nu g)_\nu$$

$$= (H_\mu(\chi_{I_n} \Delta_I^\mu f), \Delta_J^\nu g)_\nu + \langle \Delta_I^\mu f \rangle_{\mu, I_i} (H_\mu(\chi_S), \Delta_J^\nu g)_\nu - \langle \Delta_I^\mu f \rangle_{\mu, I_i} (H_\mu(\chi_{S \setminus I_i}), \Delta_J^\nu g)_\nu.$$

Here $S$ is the smallest interval from the stopping tree $\mathcal{S}$ such that $I_i \in O_S$. Also here $\langle \Delta_I^\mu f \rangle_{\mu, I_i}$ is the average of $\Delta_I^\mu f$ with respect to $\mu$ over $I_i$, which is the same as the value of this function on $I_i$ (by construction $\Delta_I^\mu f$ assumes on $I$ two values, one on $I_i$, one on $I_n$).

The sum of absolute values of the first terms and the sum of absolute values of the third terms were already bounded by $C \|f\|_\mu \|g\|_\nu$ in the preceding sections. Middle terms were called "difficult terms", and we are going to estimate the absolute value of the sum of all difficult terms now. This is the most difficult part of the proof.

Let $\{S\}_{S \in \mathcal{S}}$ denote the family of stopping type intervals of all generations (for convenience we think that we are on the circle $\mathbb{T}$ and the first generation consists of the circle itself). In what follows the letter $S$ is reserved for the stopping intervals. Recall that $\hat S$ also denotes a stopping interval, the father of $S$ inside the stopping tree $\mathcal{S}$.

Notations. Let $S \in \mathcal{S}$ be an arbitrary stopping interval. We denote by $\mathbb{P}_{\mu, O_S}$ the orthogonal projection in $L^2(\mu)$ onto the space generated by $\{h_I^\mu\}$, $I \in O_S$, $I$ is good, and we denote by $\mathbb{P}_{\nu, O_S}$ the orthogonal projection in $L^2(\nu)$ onto the space generated by $\{h_J^\nu\}$, $J \in O_S$, $J$ is good. (Recall that $O_S$ includes by definition the intervals of both lattices $\mathcal{D}^\mu$ and $\mathcal{D}^\nu$.)

We fix $I \in \mathcal{D}^\mu$; it defines $S \in \mathcal{S}$ (see (8.2)). We look at the terms

$$\langle \Delta_I^\mu f \rangle_{\mu, I_i} (H_\mu(\chi_S), \Delta_J^\nu g)_\nu.$$

We can write each of the terms $\langle \Delta_I^\mu f \rangle_{\mu, I_i} (H_\mu(\chi_S), \Delta_J^\nu g)_\nu$ with fixed $S$ and $I \in O_S$, $J \in O_S$ as

$$\langle \Delta_I^\mu \mathbb{P}_{\mu, O_S} f \rangle_{\mu, I_i} (H_\mu(\chi_S), \Delta_J^\nu \mathbb{P}_{\nu, O_S} g)_\nu.$$

The definition of $\tau_S$. We collect all of these terms with $I \in O_S$, $I \in \mathcal{D}^\mu$, $J \in O_S$, $J \in \mathcal{D}^\nu$, $|J| < 2^{-r}|I|$, $J$ is good. The resulting sum is called $\tau_S$. (In the summation below we should remember that $f, g$ are good: so we can sum over all pertinent pairs $I, J$, remembering that some of the $\Delta$'s are zero anyway.)

We first fix a good $J$. Such $I$'s should contain $J$, and they form a "tower" of nested intervals, from the smallest one, called $\ell(J)$, to the largest one, equal to $S$. Notice that the summing of the quantities $\langle \Delta_I^\mu \varphi \rangle_{\mu, I_i}$ over such a "tower" results in the average over the smallest interval minus the average over the largest interval of the "tower", the latter one being zero in our case. So summing over such $I$'s gives

$$\langle \mathbb{P}_{\mu, O_S} f \rangle_{\mu, \ell(J)} (\Delta_J^\nu H_\mu(\chi_S), \mathbb{P}_{\nu, O_S} g)_\nu,$$

where $\ell(J) \in O_S$, $\ell(J) \in \mathcal{D}^\mu$, $|\ell(J)| = 2^{r-1} |J|$. One can argue that replacing $f$ by $\mathbb{P}_{\mu, O_S} f$ we make gaps in the tower, as $\langle \Delta_I^\mu f \rangle_{\mu, I_i}$ got replaced by $0$ from time to time (for bad $I$'s actually). But this is not a problem as $f$ is good, and so $\langle \Delta_I^\mu f \rangle_{\mu, I_i}$ is zero anyway for bad $I$'s!

Summing over $J$ we get

$$\tau_S = \sum_{I \in \mathcal{D}^\mu,\ I \in O_S,\ I \text{ is good}} \langle \mathbb{P}_{\mu, O_S} f \rangle_{\mu, I} \Big( \sum_{J \in \mathcal{D}^\nu,\ J \in O_S,\ J \subset I,\ |J| = 2^{-r+1}|I|,\ J \text{ is good}} \Delta_J^\nu H_\mu(\chi_S), \ \mathbb{P}_{\nu, O_S} g \Big)_\nu.$$

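8.1. First paraproduct

Let us introduce our first paraproduct operator

$$\pi_{H_\mu \chi_S} \varphi := \sum_{I \in \mathcal{D}^\mu,\ I \in O_S} \langle \varphi \rangle_{\mu, I} \sum_{J \in \mathcal{D}^\nu,\ J \in O_S,\ J \subset I,\ |J| = 2^{-r+1}|I|,\ J \text{ is good}} \Delta_J^\nu H_\mu(\chi_S).$$

Then the absolute value of the sum $\tau_S$ above is

$$|(\pi_{H_\mu \chi_S} \mathbb{P}_{\mu, O_S} f, \mathbb{P}_{\nu, O_S} g)_\nu| \le C_1 \|\mathbb{P}_{\mu, O_S} f\|_\mu \|\mathbb{P}_{\nu, O_S} g\|_\nu, \tag{8.3}$$

where $C_1$ is the norm of $\pi_{H_\mu \chi_S}$ as an operator from $L^2(\mu)$ to $L^2(\nu)$.

Theorem 8.1. The norm of the operator $\pi_{H_\mu \chi_S}$ as an operator from $L^2(\mu)$ to $L^2(\nu)$ is bounded by $C_1(K) < \infty$, where $K$ is the constant participating in the definition of stopping intervals.

Proof. Obviously

$$\|\pi_{H_\mu \chi_S} \varphi\|_\nu^2 = \sum_{I \in \mathcal{D}^\mu,\ I \in O_S} |\langle \varphi \rangle_{\mu, I}|^2 \, a_I,$$

where $\Psi(I) := \{J : J \in \mathcal{D}^\nu, \ J \in O_S, \ J \subset I, \ |J| = 2^{-r+1}|I|, \ \operatorname{dist}(J, \partial I) \ge |I|^{3/4} |J|^{1/4}\}$ and

$$a_I := \sum_{J \in \Psi(I)} \|\Delta_J^\nu H_\mu(\chi_S)\|_\nu^2.$$

The Carleson imbedding theorem (see [TB], and in this context [21]) says that the boundedness of the sum $\sum_{I \in \mathcal{D}^\mu,\ I \in O_S} |\langle \varphi \rangle_{\mu, I}|^2 a_I$ by $C \|\varphi\|_\mu^2$ is equivalent to the following Carleson condition:

$$\forall I \in \mathcal{D}^\mu, \ I \in O_S: \qquad \sum_{\ell \in \mathcal{D}^\mu,\ \ell \in O_S,\ \ell \subset I} a_\ell \le C \mu(I). \tag{8.4}$$

Of course (with $\tilde\Psi(I) := \{J : J \in \mathcal{D}^\nu, \ J \in O_S, \ J \subset I, \ |J| \le 2^{-r+1}|I|, \ \operatorname{dist}(J, \partial I) \ge |I|^{3/4} |J|^{1/4}\}$)

$$\sum_{\ell \subset I} a_\ell = \sum_{J \in \tilde\Psi(I)} \|\Delta_J^\nu H_\mu(\chi_S)\|_\nu^2 = \Big\| \sum_{J \in \tilde\Psi(I)} \Delta_J^\nu H_\mu(\chi_S) \Big\|_\nu^2.$$

By duality then

$$\sum_{\ell \subset I} a_\ell = \sup_{\psi \in L^2(\nu),\ \|\psi\|_\nu = 1} \Big| \sum_{J \in \tilde\Psi(I)} (H_\mu(\chi_S), \Delta_J^\nu \psi)_\nu \Big|^2 \le 2 \sup_{\psi \in L^2(\nu),\ \|\psi\|_\nu = 1} \Big| \sum_{J \in \tilde\Psi(I)} (H_\mu(\chi_{S \setminus I}), \Delta_J^\nu \psi)_\nu \Big|^2 + 2 \|H_\mu(\chi_I)\|_\nu^2.$$

So (2.20) implies

$$\sum_{\ell \subset I} a_\ell \le 2 \sup_{\psi} \Big| \sum_{J \in \tilde\Psi(I)} (H_\mu(\chi_{S \setminus I}), \Delta_J^\nu \psi)_\nu \Big|^2 + 2 C_\chi \, \mu(I). \tag{8.5}$$

Let us consider the term $(H_\mu(\chi_{S \setminus I}), \Delta_J^\nu \psi)_\nu$, $J \in \tilde\Psi(I)$. Exactly this quantity was estimated in (7.4). We get

$$|(H_\mu(\chi_{S \setminus I}), \Delta_J^\nu \psi)_\nu| \le A \, \nu(J)^{1/2} \|\Delta_J^\nu \psi\|_\nu \Big(\frac{|J|}{|I|}\Big)^{1/2} P_I(\chi_{S \setminus I} \, d\mu).$$

So the first term in (8.5) is bounded by (we use the Cauchy inequality)

$$A \sum_{J \in \tilde\Psi(I)} \nu(J) \frac{|J|}{|I|} [P_I(\chi_{S \setminus I} \, d\mu)]^2 \le A \sum_n \ \sum_{|J| = 2^{-n}|I|,\ J \subset I} 2^{-n} \nu(J) [P_I(\chi_{S \setminus I} \, d\mu)]^2 \le A \sum_n 2^{-n} [P_I(\chi_{S \setminus I} \, d\mu)]^2 \, \nu(I),$$

as $\|\psi\|_\nu = 1$. It is time to use the fact that $I \in O_S$, which means that the stopping criterion (7.9) is not yet achieved on $I$, in other words that

$$[P_I(\chi_{S \setminus I} \, d\mu)]^2 \, \nu(I) \le K \mu(I).$$

Combining this with (8.5) we get (8.4):

$$\sum_{\ell \subset I} a_\ell \le A (K + C_\chi) \, \mu(I).$$

And Theorem 8.1 is proved.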

□

Let us recall that we introduced above the definition of $\tau_S$ for a stopping interval $S$. We have finished the estimate of the sum of $\tau_S$ over all stopping $S$ (recall that the set of all stopping intervals was called $\mathcal{S}$):

$$\sum_{S \in \mathcal{S}} |\tau_S| \le C(K, C_\chi) \sum_{S \in \mathcal{S}} \|\mathbb{P}_{\mu, O_S} f\|_\mu \|\mathbb{P}_{\nu, O_S} g\|_\nu \le C(K, C_\chi) \|f\|_\mu \|g\|_\nu, \tag{8.6}$$

the last inequality following from the orthogonality of $\mathbb{P}_{\mu, O_S} f$ for different $S$ (the same for $\mathbb{P}_{\nu, O_S} g$) and the Cauchy inequality.

8.2. Two more paraproducts

In the previous subsection we have estimated a piece of the sum of the difficult terms

$$\langle \Delta_I^\mu f \rangle_{\mu, I_i} (H_\mu(\chi_S), \Delta_J^\nu g)_\nu, \tag{8.7}$$

namely, we estimated the sum of such terms when $I, J$ lie both in the same family $O_S$, where $S \in \mathcal{S}$ (an arbitrary stopping interval). Such a sum was called $\tau_S$, and we just proved in (8.6) that $\sum_{S \in \mathcal{S}} |\tau_S| \le C \|f\|_\mu \|g\|_\nu$.

What is left is to estimate the sum of the abovementioned terms when $J \in O_S$ and $I$ belongs to another $O_{S_1}$, where $S, S_1$ are both stopping intervals. As $I$ is larger than $J$, we have to consider the pairs of stopping intervals where $S$ is strictly inside $S_1$ ($S_1$ is one or more generations higher in the stopping tree $\mathcal{S}$ than $S$).

Let us recall that $F(I)$ denotes the father of $I$ inside the standard dyadic tree. Let us fix $J$. Let $\ldots \subset S_3 \subset S_2 \subset S_1 \subset \ldots$ be a (finite) sequence of stopping intervals of successive generations containing $J$. So $S_{i-1}$ is the father of $S_i$ in the stopping tree $\mathcal{S}$. So it is not true that $S_{i-1} = F(S_i)$ in general!

The sequence for $I$'s, over which we have to sum up, will be one term shorter (the smallest one should be discarded). This is because we sum up all the terms where $J$ and $I$ are in different families $O_{S_i}$, $O_{S_{i-1}}$, and $S_i$ is inside $S_{i-1}$. Notice also that $\langle \Delta_I^\mu f \rangle_{\mu, I_i}$ is the difference between two averages of $f$ with respect to $\mu$, one over $I_i$ and one over its father $I$. It is easy to sum up successive differences, and summing all the above mentioned terms with fixed $J$ we get

$$\cdots + (\langle f \rangle_{\mu, S_2} - \langle f \rangle_{\mu, F(S_2)}) (H_\mu \chi_{S_2}, \Delta_J^\nu g)_\nu + \cdots + (\langle f \rangle_{\mu, F(S_3)} - \langle f \rangle_{\mu, F(F(S_3))}) (H_\mu \chi_{S_2}, \Delta_J^\nu g)_\nu$$

$$+ (\langle f \rangle_{\mu, S_3} - \langle f \rangle_{\mu, F(S_3)}) (H_\mu \chi_{S_3}, \Delta_J^\nu g)_\nu + \cdots$$

Regrouping, we get

$$\cdots + \langle f \rangle_{\mu, F(S_3)} (H_\mu \chi_{S_2 \setminus S_3}, \Delta_J^\nu g)_\nu + \cdots$$

We have to take into consideration also the terms with the smallest $S_m$ for a given $J$, for which there will be no pair. Subsequently, the sum of the abovementioned terms in (8.7), when $J \in O_S$ and $I$ belongs to another family $O_{S'}$, where $S, S'$ are both stopping intervals and $S$ is strictly smaller than $S'$, can be written in the following form. (We denote by $\hat S$ the stopping interval containing the stopping interval $S$ and of the previous generation, that is, the stopping father of $S$.)



8. Difficult terms and several paraproducts 
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Warning. In all sums we meet below J is always good. We have to add everywhere "J 
is good". For the sake of brevity we do not do that, but we ask the reader to keep this in 
mind. 

This is what we are left to estimate. 

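$$
\rho := \sum_{S\in\mathcal{S}}\ \sum_{J\in Q_S} \langle f\rangle_{\mu,F(S)}\,(H_\mu\chi_{\hat S\setminus S}, \Delta_J g)_\nu \;+\; \sum_{S\in\mathcal{S}}\ \sum_{J\in \mathcal{O}_S} \langle f\rangle_{\mu,S}\,(H_\mu\chi_{S}, \Delta_J g)_\nu .
$$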

We used the notation $Q_S = \bigcup_{S'\in\mathcal{S},\, S'\subset S} \mathcal{O}_{S'}$. This means that the family $Q_S$ consists of intervals
$I$ from both our lattices $\mathcal{D}^\mu$, $\mathcal{D}^\nu$ such that the top of the square $Q_I$ belongs to the square $Q_S$
(with a slight abuse of notation, the square and the corresponding family of intervals are
denoted by the same letter). Recall that $J$ is always good in the above sums. Then we can
introduce two projections $\mathbb{P}_{\nu,Q_S}$, $\mathbb{P}_{\nu,\mathcal{O}_S}$. Actually the second one was already introduced. But
anyway, we denote by $\mathbb{P}_{\nu,\mathcal{O}_S}$ the orthogonal projection in $L^2(\nu)$ onto the space generated by
$\{h_J\}$, $J \in \mathcal{O}_S$, $J$ good. And we denote by $\mathbb{P}_{\nu,Q_S}$ the orthogonal projection in $L^2(\nu)$ onto
the space generated by $\{h_J\}$, $J \in Q_S$, $J$ good. (Recall that $Q_S$, $\mathcal{O}_S$ include by definition
intervals from both lattices $\mathcal{D}^\mu$ and $\mathcal{D}^\nu$.) Now we can write $\rho$ as follows

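$$
\rho = \sum_{S\in\mathcal{S}} \langle f\rangle_{\mu,S}\,(\mathbb{P}_{\nu,\mathcal{O}_S}(H_\mu\chi_S), g)_\nu \;+\; \sum_{S\in\mathcal{S}} \langle f\rangle_{\mu,F(S)}\,(\mathbb{P}_{\nu,Q_S}(H_\mu\chi_{\hat S\setminus S}), g)_\nu \;=:\; \rho_1 + \rho_2 .
$$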



We introduce now two paraproducts: 

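$$
\pi^{\mathcal{O}} f := \sum_{S\in\mathcal{S}} \langle f\rangle_{\mu,S}\,\mathbb{P}_{\nu,\mathcal{O}_S}(H_\mu\chi_S),
$$

$$
\pi^{Q} f := \sum_{S\in\mathcal{S}} \langle f\rangle_{\mu,F(S)}\,\mathbb{P}_{\nu,Q_S}(H_\mu\chi_{\hat S\setminus S}).
$$

Then $\rho_1 = (\pi^{\mathcal{O}} f, g)_\nu$, $\rho_2 = (\pi^{Q} f, g)_\nu$. So to finish the proof of our main Theorem 2.1 it is
enough to prove the boundedness of these paraproducts as operators from $L^2(\mu)$ to $L^2(\nu)$.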

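To prove the boundedness of the first paraproduct let us use Theorem 7.2. Consider the
sequence

$$
\{b_S\}_{S\in\mathcal{S}}, \qquad b_S := \|\mathbb{P}_{\nu,\mathcal{O}_S}(H_\mu\chi_S)\|_\nu^2 .
$$

It is a Carleson sequence:

$$
\forall I \in \mathcal{D}^\mu \qquad \sum_{S\subset I,\ S\in\mathcal{S}} b_S \le C\mu(I). \tag{8.9}
$$

In fact, $b_S \le \|H_\mu\chi_S\|_\nu^2 \le C_\chi\,\mu(S)$ by (2.20). Now (8.9) becomes clear by Theorem 7.2.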

Notice that $\mathbb{P}_{\nu,\mathcal{O}_S}$ are mutually orthogonal projections in $L^2(\nu)$ for different $S$. This is
just because the families $\mathcal{O}_S$ are pairwise disjoint for different $S \in \mathcal{S}$. This is exactly what
helped us to cope with $\pi^{\mathcal{O}} f$ so easily. We just used

$$
\|\pi^{\mathcal{O}} f\|_\nu^2 = \Big\|\sum_{S\in\mathcal{S}} \langle f\rangle_{\mu,S}\,\mathbb{P}_{\nu,\mathcal{O}_S}(H_\mu\chi_S)\Big\|_\nu^2 = \sum_{S\in\mathcal{S}} |\langle f\rangle_{\mu,S}|^2\,\|\mathbb{P}_{\nu,\mathcal{O}_S}(H_\mu\chi_S)\|_\nu^2 = \sum_{S\in\mathcal{S}} |\langle f\rangle_{\mu,S}|^2\, b_S .
$$

This is where the orthogonality has been used. We then applied the Carleson property
of $\{b_S\}_{S\in\mathcal{S}}$. We already saw this type of paraproduct with the property of orthogonality
(see [16], [31], and especially Theorem 8.1 above). And we know that the Carleson condition
(8.9) is sufficient for the paraproduct operator $\pi^{\mathcal{O}}$ to be bounded.
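
For orientation, the step from the Carleson condition to boundedness can be spelled out as follows. This is a sketch that relies on the dyadic Carleson embedding theorem. The explicit constant $4$ is the standard one coming from Doob's maximal inequality and is quoted here as a known fact, not taken from the text.

```latex
\[
\sum_{S\subset I,\ S\in\mathcal{S}} b_S \le C\,\mu(I)\quad \forall I\in\mathcal{D}^\mu
\qquad\Longrightarrow\qquad
\sum_{S\in\mathcal{S}} |\langle f\rangle_{\mu,S}|^2\, b_S \;\le\; 4C\,\|f\|_{L^2(\mu)}^2 ,
\]
\[
\text{hence}\qquad
\|\pi^{\mathcal{O}} f\|_\nu^2 \;=\; \sum_{S\in\mathcal{S}} |\langle f\rangle_{\mu,S}|^2\, b_S
\;\le\; 4C\,\|f\|_{L^2(\mu)}^2 .
\]
```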

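The second paraproduct $\pi^{Q}$.

The main problem is that $\mathbb{P}_{\nu,Q_S}$ are not mutually orthogonal projections in $L^2(\nu)$.
So $\|\pi^{Q} f\|_\nu^2$ has a diagonal part but also an off-diagonal part:

$$
\|\pi^{Q} f\|_\nu^2 \le DP + ODP,
$$

where

$$
DP := \sum_{S\in\mathcal{S}} |\langle f\rangle_{\mu,F(S)}|^2\,\|\mathbb{P}_{\nu,Q_S} H_\mu(\chi_{\hat S\setminus S})\|_\nu^2,
$$

$$
ODP := \sum_{S,S'\in\mathcal{S},\ S'\subset S,\ S'\neq S} |\langle f\rangle_{\mu,F(S')}|\,|\langle f\rangle_{\mu,F(S)}|\,\big|(\mathbb{P}_{\nu,Q_S} H_\mu(\chi_{\hat S\setminus S}),\ \mathbb{P}_{\nu,Q_{S'}} H_\mu(\chi_{\widehat{S'}\setminus S'}))_\nu\big|
$$

$$
= \sum_{S,S'\in\mathcal{S},\ S'\subset S,\ S'\neq S} |\langle f\rangle_{\mu,F(S')}|\,|\langle f\rangle_{\mu,F(S)}|\,\big|(\mathbb{P}_{\nu,Q_{S'}} H_\mu(\chi_{\hat S\setminus S}),\ \mathbb{P}_{\nu,Q_{S'}} H_\mu(\chi_{\widehat{S'}\setminus S'}))_\nu\big| .
$$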

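We start with $ODP$. Recall that $r = r(S', S)$ is the generation gap between $S'$ and $S$, $S' \subset S$,
in the stopping tree $\mathcal{S}$.

$$
ODP \le \sum_{S,S'\in\mathcal{S},\ S'\subset S,\ S'\neq S} |\langle f\rangle_{\mu,F(S)}|^2\,\|\mathbb{P}_{\nu,Q_{S'}} H_\mu(\chi_{\hat S\setminus S})\|_\nu^2 \cdot (1+\varepsilon)^{r(S',S)}
$$

$$
{} + \sum_{S,S'\in\mathcal{S},\ S'\subset S,\ S'\neq S} |\langle f\rangle_{\mu,F(S')}|^2\,\|\mathbb{P}_{\nu,Q_{S'}} H_\mu(\chi_{\widehat{S'}\setminus S'})\|_\nu^2 \cdot (1+\varepsilon)^{-r(S',S)}
$$

$$
\le \sum_{S\in\mathcal{S}} |\langle f\rangle_{\mu,F(S)}|^2 \sum_{j=1}^{\infty} (1+\varepsilon)^{j} \sum_{S'\in\mathcal{S},\ S'\subset S,\ r(S',S)=j} \|\mathbb{P}_{\nu,Q_{S'}} H_\mu(\chi_{\hat S\setminus S})\|_\nu^2 \;+\; C(\varepsilon) \sum_{S\in\mathcal{S}} |\langle f\rangle_{\mu,F(S)}|^2\,\|\mathbb{P}_{\nu,Q_S} H_\mu(\chi_{\hat S\setminus S})\|_\nu^2 .
$$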

Now we need to estimate these sums 

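$$
F_j := \sum_{S\in\mathcal{S}} |\langle f\rangle_{\mu,F(S)}|^2 \sum_{S'\in\mathcal{S},\ S'\subset S,\ r(S',S)=j} \|\mathbb{P}_{\nu,Q_{S'}} H_\mu(\chi_{\hat S\setminus S})\|_\nu^2, \qquad j = 1,2,3,\dots,
$$

$$
F_0 := \sum_{S\in\mathcal{S}} |\langle f\rangle_{\mu,F(S)}|^2\,\|\mathbb{P}_{\nu,Q_S} H_\mu(\chi_{\hat S\setminus S})\|_\nu^2 .
$$

By the way, $F_0 = DP$.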
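
The weights $(1+\varepsilon)^{\pm r}$ in the estimate of $ODP$ come from a weighted arithmetic-geometric mean inequality. The following sketch uses the shorthand $a, b, u, v$, introduced here only for illustration, for the two averages and the two projected functions attached to a pair $S' \subset S$ with $r = r(S',S)$. It also uses the observation that a fixed $S'$ has at most one stopping ancestor $S$ at each generation gap $j \ge 1$.

```latex
\begin{align*}
|a|\,|b|\,|(u,v)_\nu|
  &\le |a|\,\|u\|_\nu \cdot |b|\,\|v\|_\nu
   \le \tfrac12\Big( (1+\varepsilon)^{r}\,|a|^2\|u\|_\nu^2
        + (1+\varepsilon)^{-r}\,|b|^2\|v\|_\nu^2 \Big), \\
\sum_{S \supsetneq S'} (1+\varepsilon)^{-r(S',S)}
  &\le \sum_{j=1}^{\infty} (1+\varepsilon)^{-j} = \frac{1}{\varepsilon}.
\end{align*}
```

So the terms carrying $(1+\varepsilon)^{-r}$ collapse to a multiple of $F_0$, while the terms carrying $(1+\varepsilon)^{r}$ are grouped by $j$ into $(1+\varepsilon)^j F_j$. The latter series converges as soon as $F_j$ decays exponentially in $j$ at a rate faster than $(1+\varepsilon)^{-j}$.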

All such sums have the form appearing in Carleson imbedding theorems. So we now need to check
a countable number of Carleson conditions.

Carleson condition for $F_j$. We introduce the sequence

$$
a_S := \|\mathbb{P}_{\nu,Q_S} H_\mu(\chi_{\hat S\setminus S})\|_\nu^2, \qquad S, \hat S \in \mathcal{S},\ r(S,\hat S) = 1 .
$$

And also

$$
a_S^j := \sum_{S'\in\mathcal{S},\ S'\subset S,\ r(S',S)=j} \|\mathbb{P}_{\nu,Q_{S'}} H_\mu(\chi_{\hat S\setminus S})\|_\nu^2, \qquad r(S,\hat S) = 1,\ j = 1,2,3,\dots
$$



We will need the following Lemma. 

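Lemma 8.2. Let $A' \subset A \subset B$ be intervals of $\mathcal{D}^\mu$. Let the tree distance between $A'$ and $A$
with respect to the tree $\mathcal{D}^\mu$ satisfy $t(A', A) \ge j$, $j = 0, 1, 2, \dots$ Then

$$
\|\mathbb{P}_{\nu,A'}(H_\mu\chi_{B\setminus A})\|_\nu^2 \le C\,2^{-j}\,\nu(A')\,\big(P_A(\chi_{B\setminus A}\,d\mu)\big)^2 .
$$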

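Proof. Let $\|\psi\|_\nu = 1$. Let us consider the term $(H_\mu(\chi_{B\setminus A}), \Delta_J\psi)_\nu$, $J \in Q_{A'}$. Exactly this
quantity was estimated in (7.4). We get

$$
|(H_\mu(\chi_{B\setminus A}), \Delta_J\psi)_\nu| \le C\,\nu(J)^{1/2}\,\|\Delta_J\psi\|_\nu \Big(\frac{|J|}{|A|}\Big)^{1/2} P_A(\chi_{B\setminus A}\,d\mu).
$$

So each of our projections can be estimated as follows:

$$
\|\mathbb{P}_{\nu,A'}(H_\mu\chi_{B\setminus A})\|_\nu^2 \le C\,\big(P_A(\chi_{B\setminus A}\,d\mu)\big)^2 \sum_{J\ \text{good},\ J\subset A'} \nu(J)\,\frac{|J|}{|A|}. \tag{8.10}
$$

So $\|\mathbb{P}_{\nu,A'}(H_\mu\chi_{B\setminus A})\|_\nu^2$ is bounded by

$$
C\,\big(P_A(\chi_{B\setminus A}\,d\mu)\big)^2 \sum_{t=j}^{\infty}\ \sum_{|J| = 2^{-t}|A|,\ J\subset A'} \nu(J)\,\frac{|J|}{|A|},
$$

which proves the lemma. □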
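
The last step of the proof is a geometric series. A brief sketch follows, assuming (as is standard for a dyadic lattice) that the intervals $J \subset A'$ of a fixed length taken from one lattice are pairwise disjoint, so that their $\nu$-masses add up to at most $\nu(A')$:

```latex
\begin{align*}
\sum_{t=j}^{\infty}\ \sum_{|J| = 2^{-t}|A|,\ J\subset A'} \nu(J)\,\frac{|J|}{|A|}
  &= \sum_{t=j}^{\infty} 2^{-t} \sum_{|J| = 2^{-t}|A|,\ J\subset A'} \nu(J) \\
  &\le \sum_{t=j}^{\infty} 2^{-t}\,\nu(A') \;=\; 2^{1-j}\,\nu(A').
\end{align*}
```

The sum starts at $t = j$ because $t(A',A) \ge j$ forces $|J| \le |A'| \le 2^{-j}|A|$ for every $J \subset A'$.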

We first establish a Carleson property for $\{a_S\}$. Let $I$ be in $\mathcal{D}^\mu$. We choose first the
smallest stopping interval containing (it might be equal to) $I$. We call it $\hat S$, abusing the
notation slightly. Consider the family of its stopping sons $\{S_\alpha\}_{\alpha\in A}$ such that $S_\alpha \subset I$. Using
our notation for the father in the stopping tree $\mathcal{S}$ we can write

$$
\hat S_\alpha = \hat S \qquad \forall \alpha \in A .
$$

There can be a case when this family consists of one interval (call it $S_0$) and $S_0 = I$. We consider
this case later. Now we assume that all $S_\alpha$, $\alpha \in A$, are strictly smaller than $I$, and therefore

$$
F(S_\alpha) \subset I \qquad \forall \alpha \in A .
$$

Notice that

$$
\big(P_{F(S_\alpha)}(\chi_{\hat S\setminus F(S_\alpha)}\,d\mu)\big)^2\,\nu(F(S_\alpha)) \le K\mu(F(S_\alpha)) \qquad \forall \alpha \in A. \tag{8.11}
$$

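But this is not true if we replace $F(S_\alpha)$ by $S_\alpha$! Let us naively use (8.11) and Lemma 8.2.
Then we get

$$
\sum_{\alpha\in A} \|\mathbb{P}_{\nu,S_\alpha}(H_\mu\chi_{\hat S\setminus S_\alpha})\|_\nu^2 \le 2\sum_{\alpha\in A} \|\mathbb{P}_{\nu,S_\alpha}(H_\mu\chi_{\hat S\setminus F(S_\alpha)})\|_\nu^2 + 2\sum_{\alpha\in A} \|\mathbb{P}_{\nu,S_\alpha}(H_\mu\chi_{F(S_\alpha)\setminus S_\alpha})\|_\nu^2
$$

$$
\le 2K \sum_{\alpha\in A} \mu(F(S_\alpha)) + 2\sum_{\alpha\in A} \|H_\mu\chi_{F(S_\alpha)\setminus S_\alpha}\|_\nu^2 \le C\,(K + C_\chi) \sum_{\alpha\in A} \mu(F(S_\alpha)) .
$$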

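In other words we would like to conclude that

$$
\sum_{\alpha\in A} \|\mathbb{P}_{\nu,Q_{S_\alpha}} H_\mu(\chi_{\hat S\setminus S_\alpha})\|_\nu^2 \le C\mu(I). \tag{8.12}
$$

But instead, by naive reasoning we achieved

$$
\sum_{\alpha\in A} \|\mathbb{P}_{\nu,Q_{S_\alpha}} H_\mu(\chi_{\hat S\setminus S_\alpha})\|_\nu^2 \le C \sum_{\alpha\in A} \mu(F(S_\alpha)) . \tag{8.13}
$$

This is a dangerous place because, while the intervals $S_\alpha$ are pairwise disjoint, their fathers
$F(S_\alpha)$ usually are not, and we cannot deduce (8.12) from (8.13), as it is not guaranteed
that

$$
\sum_{\alpha\in A} \mu(F(S_\alpha)) \le c\,\mu(I) .
$$

This actually is usually false.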

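However, (8.12) is true. But the way to prove it is more subtle. Let us do it. Let $\{F_\beta\}_{\beta\in B}$
denote the family of maximal intervals among $\{F(S_\alpha)\}_{\alpha\in A}$. For a given $\beta \in B$, let the family
$\{S_{\beta,\gamma}\}$ denote all intervals from $\{S_\alpha\}_{\alpha\in A}$ that lie in $F_\beta$. Now

$$
\sum_{\alpha\in A} \|\mathbb{P}_{\nu,S_\alpha}(H_\mu\chi_{\hat S\setminus S_\alpha})\|_\nu^2 = \sum_{\beta\in B}\sum_{\gamma} \|\mathbb{P}_{\nu,S_{\beta,\gamma}}(H_\mu\chi_{\hat S\setminus S_{\beta,\gamma}})\|_\nu^2
$$

$$
\le 2\sum_{\beta\in B}\sum_{\gamma} \|\mathbb{P}_{\nu,S_{\beta,\gamma}}(H_\mu\chi_{\hat S\setminus F_\beta})\|_\nu^2 + 2\sum_{\beta\in B}\sum_{\gamma} \|\mathbb{P}_{\nu,S_{\beta,\gamma}}(H_\mu\chi_{F_\beta\setminus S_{\beta,\gamma}})\|_\nu^2 =: \Sigma_1 + \Sigma_2 .
$$

For the second sum: $\sum_\gamma \|\mathbb{P}_{\nu,S_{\beta,\gamma}}(H_\mu\chi_{F_\beta\setminus S_{\beta,\gamma}})\|_\nu^2 \le 2\sum_\gamma \|\mathbb{P}_{\nu,S_{\beta,\gamma}}(H_\mu\chi_{F_\beta})\|_\nu^2 + 2\sum_\gamma \|H_\mu\chi_{S_{\beta,\gamma}}\|_\nu^2 \le C_\chi\,\mu(F_\beta)$
by our Sawyer type test assumption (2.20). Also we can now use the disjointness
of the $F_\beta$ to conclude that

$$
\Sigma_2 \le C\mu(I).
$$

For the first sum we use Lemma 8.2 to conclude

$$
\Sigma_1 \le \sum_{\beta\in B}\sum_{\gamma} \big(P_{F_\beta}(\chi_{\hat S\setminus F_\beta}\,d\mu)\big)^2\,\nu(S_{\beta,\gamma}) \le \sum_{\beta\in B} \big(P_{F_\beta}(\chi_{\hat S\setminus F_\beta}\,d\mu)\big)^2\,\nu(F_\beta) \le K \sum_{\beta\in B} \mu(F_\beta) \le K\mu(I) .
$$

We used here the disjointness twice.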

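Finally (8.12) is proved. But to prove the estimate of Carleson type for $\{a_S\}_{S\in\mathcal{S}}$ we need
not just (8.12) but

$$
\sum_{S\in\mathcal{S},\ F(S)\subset I} \|\mathbb{P}_{\nu,Q_S} H_\mu(\chi_{\hat S\setminus S})\|_\nu^2 \le C\mu(I) . \tag{8.14}
$$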

We estimated not the whole sum above but only the sum over maximal $S$ such that $S \in \mathcal{S}$,
$F(S) \subset I$. By the way, now it is time to return to the last case: when $S_0 = I$ (see above).
Notice that in this case we also estimated

$$
\sum_{S_\alpha\in\mathcal{S},\ F(S_\alpha)\subset I,\ S_\alpha\ \text{is maximal}} \|\mathbb{P}_{\nu,Q_{S_\alpha}} H_\mu(\chi_{\hat S_\alpha\setminus S_\alpha})\|_\nu^2 \le C\mu(I) . \tag{8.15}
$$

But the standard reasoning shows that (8.15) is enough to prove (8.14)! In fact, if our $S$ in
the sum in (8.14) is not maximal, it is contained in a maximal one. Denoting by $S_j(\alpha)$ the
maximal such $S$ contained in $S_\alpha$ we conclude

$$
\sum_{j} \|\mathbb{P}_{\nu,Q_{S_j(\alpha)}} H_\mu(\chi_{S_\alpha\setminus S_j(\alpha)})\|_\nu^2 \le C\mu(S_\alpha) .
$$

We sum over $j$ and $\alpha$ and notice that our main stopping property says

$$
\sum_{\alpha} \mu(S_\alpha) \le \mu(I) .
$$

This gives the sum over maximal intervals inside maximal intervals. The next generation of
stopping intervals will give a contribution at most half as large, because

$$
\sum_{\alpha}\sum_{j} \mu(S_j(\alpha)) \le \frac12 \sum_{\alpha} \mu(S_\alpha) \le \frac12\,\mu(I),
$$

the generation after that will come with a contribution smaller by another factor of 2, et cetera.
All this is because of Theorem 7.2. And we obtain (8.14).

This gives

$$
DP = F_0 \le C\,\|f\|_\mu^2 . \tag{8.16}
$$
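
The generation-by-generation summation that upgrades the estimate over maximal intervals to the full Carleson estimate can be sketched as follows. The symbols $G_k$ and $\delta$ are hypothetical shorthand introduced only for this illustration: $G_k$ is the total $\mu$-mass of the $k$-th generation of stopping intervals $S$ with $F(S) \subset I$, and $\delta < 1$ is the factor by which the stopping property shrinks the mass from one generation to the next ($\delta = 1/2$ in the halving case). The sketch assumes $G_0 \le \mu(I)$ and $G_{k+1} \le \delta\,G_k$.

```latex
\begin{align*}
\sum_{S\in\mathcal{S},\ F(S)\subset I} a_S
  &\le C\mu(I) + C\sum_{k=0}^{\infty} G_k
   \le C\mu(I) + C\mu(I)\sum_{k=0}^{\infty} \delta^{k} \\
  &= C\Big(1 + \frac{1}{1-\delta}\Big)\mu(I),
  \qquad\text{which equals } 3C\mu(I) \text{ for } \delta = \tfrac12 .
\end{align*}
```

Here the first term accounts for the maximal intervals, and the $k$-th term of the series accounts for the maximal intervals inside the intervals of generation $k$, each estimated by the bound for maximal intervals applied with $I$ replaced by that interval.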

We are left to estimate ODP. 

8.3. Miraculous improvement of the Carleson property of the sequence $\{a_S^j\}_{S\in\mathcal{S}}$

We used Lemma 8.2 above. But we used it only with $j = 0$. Now we will be estimating
the Carleson constant for $\{a_S^j\}_{S\in\mathcal{S}}$, and it should be exponentially small. We will again use
Lemma 8.2, but with $j > 0$. Recall that $r(S', S)$ denotes the tree distance between these two
intervals inside the stopping tree. We again consider $I \in \mathcal{D}^\mu$ and the smallest $\hat S \in \mathcal{S}$ containing
$I$. We now need the estimate

$$
\sum_{S\in\mathcal{S},\ F(S)\subset I}\ \sum_{S'\subset S,\ r(S',S)=j} \|\mathbb{P}_{\nu,Q_{S'}} H_\mu(\chi_{\hat S\setminus S})\|_\nu^2 \le C\,2^{-cj}\mu(I) . \tag{8.17}
$$

We repeat verbatim the reasoning of the previous section, and of course $2^{-j}$ appears
naturally from Lemma 8.2. We just use the fact that the intervals $S'$ involved in $\mathbb{P}_{\nu,Q_{S'}}$ have the
property

$$
t(S', S) \ge r(S', S) \ge j .
$$

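The only place where one should be careful in order to get the extra $2^{-cj}$ is the estimate of $\Sigma_2$.
We cannot use

$$
\sum_{\gamma}\ \sum_{S'\subset S_{\beta,\gamma},\ r(S',S_{\beta,\gamma})=j} \|\mathbb{P}_{\nu,S'}(H_\mu\chi_{F_\beta\setminus S_{\beta,\gamma}})\|_\nu^2
$$

$$
\le 2\sum_{\gamma}\ \sum_{S'\subset S_{\beta,\gamma},\ r(S',S_{\beta,\gamma})=j} \|\mathbb{P}_{\nu,S'}(H_\mu\chi_{F_\beta})\|_\nu^2 + 2\sum_{\gamma} \|H_\mu\chi_{S_{\beta,\gamma}}\|_\nu^2 \le C_\chi\,\mu(F_\beta)
$$

anymore. Actually we can say that, but this does not give the extra $2^{-cj}$. Instead, by Lemma 8.2,

$$
\sum_{\gamma}\ \sum_{S'\subset S_{\beta,\gamma},\ r(S',S_{\beta,\gamma})=j} \|\mathbb{P}_{\nu,S'}(H_\mu\chi_{F_\beta\setminus S_{\beta,\gamma}})\|_\nu^2
$$

$$
\le C\,2^{-j} \sum_{\gamma}\ \sum_{S'\subset S_{\beta,\gamma},\ r(S',S_{\beta,\gamma})=j} \nu(S')\,\big(P_{S_{\beta,\gamma}}(\chi_{F_\beta\setminus S_{\beta,\gamma}}\,d\mu)\big)^2
$$

$$
\le C\,2^{-j} \sum_{\gamma} \nu(S_{\beta,\gamma})\,\big(P_{S_{\beta,\gamma}}(\chi_{F_\beta\setminus S_{\beta,\gamma}}\,d\mu)\big)^2 .
$$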

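For a fixed $\beta$, the intervals $S_{\beta,\gamma}$ are disjoint by their construction (see above). It is time to
use (7.6). (We use it here for the second time in our proof. The first time was in Theorem
7.2.) If we apply (7.6) to the last sum, we get

$$
\sum_{\gamma} \nu(S_{\beta,\gamma})\,\big(P_{S_{\beta,\gamma}}(\chi_{F_\beta\setminus S_{\beta,\gamma}}\,d\mu)\big)^2 \le P\,\mu(F_\beta) .
$$

Therefore,

$$
\sum_{\gamma}\ \sum_{S'\subset S_{\beta,\gamma},\ r(S',S_{\beta,\gamma})=j} \|\mathbb{P}_{\nu,S'}(H_\mu\chi_{F_\beta\setminus S_{\beta,\gamma}})\|_\nu^2 \le c\,2^{-j}\mu(F_\beta) .
$$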

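We already said that all terms, in particular the analog of the sum $\Sigma_1$, also get the factor $2^{-j}$.
This is nice, as we get

$$
\sum_{S\in\mathcal{S},\ F(S)\subset I,\ S\ \text{is maximal}} a_S^j \le c\,2^{-j}\mu(I) .
$$

Now we again need to estimate the whole sum

$$
\sum_{S\in\mathcal{S},\ F(S)\subset I} a_S^j \le c\,2^{-j}\mu(I) . \tag{8.18}
$$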

This is achieved exactly as before with the help of (7.10) of Theorem 7.2. We consider $S_\alpha$
to be the maximal $S \in \mathcal{S}$, $F(S) \subset I$, and then for a fixed $\alpha$ consider $S_j(\alpha)$ to be the maximal
$S \in \mathcal{S}$, $F(S) \subset S_\alpha$.

The next generation of stopping intervals will give a contribution of order $\tfrac12\,2^{-j}\mu(I)$ because

$$
\sum_{\alpha}\sum_{j} \mu(S_j(\alpha)) \le \frac12 \sum_{\alpha} \mu(S_\alpha) \le \frac12\,\mu(I),
$$

the generation after that will come with a contribution of order $\tfrac14\,2^{-j}\mu(I)$, et cetera. And we get (8.18).
All this is because of Theorem 7.2.

Theorem 2.1 is completely proved.